Abstract

We propose a technique to derive upper bounds on Gallager’s 
cost-constrained random coding exponent function. Applying this 
technique to the non-coherent peak-power or average-power limited 
discrete time memoryless Ricean fading channel, we obtain the high 
signal-to-noise ratio (SNR) expansion of this channel’s cut-off rate. 

At high SNR the gap between channel capacity and the cut-off rate 
approaches a finite limit. This limit is approximately 0.26 nats per 
channel-use for zero specular component (Rayleigh) fading and approaches
0.39 nats per channel-use for very large specular components.

We also compute the asymptotic cut-off rate of a Rayleigh fading 
channel when the receiver has access to some partial side information 
concerning the fading. It is demonstrated that the cut-off rate does 
not utilize the side information as efficiently as capacity, and that the 
high SNR gap between the two increases to infinity as the imperfect 
side information becomes more and more precise. 

Keywords: Asymptotic, channel capacity, cut-off rate, fading, high SNR,
Ricean fading. 

1 Introduction 

This paper addresses the computation of a function that is key to the evaluation
of both the random coding and sphere packing error exponents. This
function, often denoted $E_0(\varrho)$, is usually expressed as a maximization
problem over input distributions. Consequently, it is conceptually easily bounded
from below: any feasible input distribution gives rise to such a bound. In
this paper we propose to use a dual expression for $E_0(\varrho)$, an expression
that involves a minimization over output distributions, in order to derive
upper bounds on $E_0(\varrho)$. We shall demonstrate this approach by studying
the cutoff rate of non-coherent Ricean fading channels. To that end we shall
have to study the appropriate modifications to the function $E_0(\varrho)$ that are
needed to account for input constraints and for channel input and
output alphabets that are infinite.

It should be noted that the dual expression we propose to use is not new 
Q, 0 Ex. 23 in Ch. 2.5]. We merely extend it here to input constrained 
channels over infinite alphabets and demonstrate how it can be used to derive 
analytic upper bounds on the random coding and sphere packing error exponents.
For numerical procedures (for unconstrained finite alphabet channels)
see jS]. 

The rest of this introductory section is dedicated to the introduction of
the function $E_0(\varrho)$ for discrete memoryless channels. We first treat
unconstrained channels and then introduce the modifications that are needed to
account for input constraints. We describe both the “method of types”
approach and Gallager’s approach. We pay special attention to the modification
that Gallager introduced to account for cost constraints and to the duality
between the expressions derived using the two approaches. This introduction
is somewhat lengthy because, while the results are not new, we had difficulty
pointing to a publication that introduces the two approaches side by side and
that compares the two in the presence of cost constraints.

In Section 2 we extend the discussion to infinite alphabets and prove the
basic inequality on which our approach to upper bounding $E_0(\varrho)$ is based;
see Proposition 1. In Section 3 we introduce the discrete-time memoryless
Ricean fading channel with and without full or partial side information at
the receiver, and we describe our asymptotic results on this channel’s cutoff
rate. These asymptotic results are derived using duality in Section 4, which
concludes the paper.

1.1 Unconstrained Inputs 

To motivate the interest in the function $E_0(\varrho)$ we shall begin by addressing
the case where there are no input constraints. The reliability function $E(R)$
corresponding to rate-$R$ unconstrained communication over a discrete memoryless
channel (DMC) of capacity $C > R$ is the best exponential decay in
the blocklength $n$ of the average probability of error that one can achieve
using rate-$R$ blocklength-$n$ codebooks. That is,

$$E(R) = \limsup_{n \to \infty} -\frac{1}{n} \log P_e(n, R) \tag{1}$$

where $P_e(n, R)$ denotes the average probability of error of the best rate-$R$
blocklength-$n$ codebook for the given channel.
The problem of computing the reliability function of a general DMC over
the finite input and output alphabets $\mathcal{X}$ and $\mathcal{Y}$ and of a general law $W(y|x)$ is
still open. Various upper and lower bounds are, however, known. To derive
lower bounds on the reliability function one must derive upper bounds on the
probability of error of the best rate-$R$ blocklength-$n$ code. This is typically
done by demonstrating the existence of good codes for which the average
probability of error is small. One such lower bound on $E(R)$ is the random
coding lower bound [1]. By considering an ensemble of codebooks whose
codewords are chosen independently, each according to a product distribution
of marginal law $Q$, Gallager derived the lower bound


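$$E(R) \ge E_{\mathrm{G}}(R, Q) \tag{2}$$

where

$$E_{\mathrm{G}}(R, Q) = \max_{0 \le \varrho \le 1} \{ E_{\mathrm{G},0}(\varrho, Q) - \varrho R \} \tag{3}$$

and

$$E_{\mathrm{G},0}(\varrho, Q) = -\log \sum_{y \in \mathcal{Y}} \Bigl( \sum_{x \in \mathcal{X}} Q(x) W(y|x)^{\frac{1}{1+\varrho}} \Bigr)^{1+\varrho}. \tag{4}$$

Since the law $Q$ from which the ensemble of codebooks is constructed is
arbitrary, Gallager obtained the bound

$$E(R) \ge E_{\mathrm{G,r}}(R) \tag{5}$$

where $E_{\mathrm{G,r}}(R)$ is Gallager’s random coding error exponent

$$E_{\mathrm{G,r}}(R) = \max_{Q} E_{\mathrm{G}}(R, Q) \tag{6}$$

$$= \max_{Q} \max_{0 \le \varrho \le 1} \{ E_{\mathrm{G},0}(\varrho, Q) - \varrho R \}. \tag{7}$$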
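As a rough numerical illustration of (3) and (4), the sketch below evaluates Gallager's function for a finite channel. The binary symmetric channel, its crossover probability 0.1, the uniform input law and the grid search over the exponent parameter are all assumptions made for this example; none of them comes from the paper.

```python
import math

def gallager_e0(rho, q, w):
    """E_{G,0}(rho, Q) of (4); w[x][y] = W(y|x), natural logarithms."""
    total = 0.0
    for y in range(len(w[0])):
        inner = sum(q[x] * w[x][y] ** (1.0 / (1.0 + rho)) for x in range(len(q)))
        total += inner ** (1.0 + rho)
    return -math.log(total)

def gallager_exponent(rate, q, w, grid=1000):
    """E_G(R, Q) of (3); the maximum is taken over a uniform grid on [0, 1]."""
    return max(gallager_e0(k / grid, q, w) - (k / grid) * rate
               for k in range(grid + 1))

# Hypothetical example: binary symmetric channel with crossover probability 0.1.
p = 0.1
bsc = [[1 - p, p], [p, 1 - p]]
uniform = [0.5, 0.5]
cutoff_rate = gallager_e0(1.0, uniform, bsc)        # E_{G,0}(1, Q)
exponent_at_low_rate = gallager_exponent(0.1, uniform, bsc)
```

For this symmetric example the value at exponent parameter 1 has the closed form log 2 - log(1 + 2 sqrt(p(1-p))), which gives a convenient sanity check.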


A different random coding lower bound on the reliability function can be 
derived using the ensemble of codebooks where the codewords are still chosen 
independently, but rather than according to a product distribution, each is 
now chosen uniformly over a type class j2l 2.5], p], jH]- With this approach 
one obtains 0 2.5], m the lower bound 

$$E(R) \ge E_{\mathrm{CK}}(R, Q) \tag{8}$$

where

$$E_{\mathrm{CK}}(R, Q) = \min_{V} \bigl\{ D(V \| W | Q) + | I(Q, V) - R |^{+} \bigr\}. \tag{9}$$
Here the minimization is over all conditional laws

$$V(y|x) \ge 0, \quad \sum_{y \in \mathcal{Y}} V(y|x) = 1, \quad \forall x \in \mathcal{X}; \tag{10}$$

$$D(V \| W | Q) = \sum_{x \in \mathcal{X}} Q(x) D\bigl(V(\cdot|x) \| W(\cdot|x)\bigr) \tag{11}$$

$$= \sum_{x \in \mathcal{X}} Q(x) \sum_{y \in \mathcal{Y}} V(y|x) \log \frac{V(y|x)}{W(y|x)}; \tag{12}$$

the term $I(Q, V)$ denotes the mutual information corresponding to the channel
$V$ and the input distribution $Q$; and $|\xi|^{+}$ stands for $\max\{\xi, 0\}$. Again,
since the type $Q$ according to which the ensemble is generated is arbitrary,
one obtains

$$E(R) \ge E_{\mathrm{CK,r}}(R) \tag{13}$$

where

$$E_{\mathrm{CK,r}}(R) = \max_{Q} E_{\mathrm{CK}}(R, Q) \tag{14}$$

$$= \max_{Q} \min_{V} \bigl\{ D(V \| W | Q) + | I(Q, V) - R |^{+} \bigr\}. \tag{15}$$
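There is an alternative form for $E_{\mathrm{CK}}(R, Q)$ that will be of interest to us
m, [2, Ex. 23 in Ch. 2.5]. This form is more similar to (3):

$$E_{\mathrm{CK}}(R, Q) = \max_{0 \le \varrho \le 1} \{ E_{\mathrm{CK},0}(\varrho, Q) - \varrho R \} \tag{16}$$

where

$$E_{\mathrm{CK},0}(\varrho, Q) = \min_{V} \{ D(V \| W | Q) + \varrho I(Q, V) \} \tag{17}$$

$$= \min_{R} \Bigl\{ -(1+\varrho) \sum_{x \in \mathcal{X}} Q(x) \log \sum_{y \in \mathcal{Y}} W(y|x)^{\frac{1}{1+\varrho}} R(y)^{\frac{\varrho}{1+\varrho}} \Bigr\} \tag{18}$$

and where the minimization in the latter is over the set of all distributions
$R$ on the output alphabet $\mathcal{Y}$.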

In general, for any DMC $W(y|x)$ and any input distribution $Q$ jTj, [2, Ex.
23 in Ch. 2.5]

$$E_{\mathrm{CK},0}(\varrho, Q) \ge E_{\mathrm{G},0}(\varrho, Q), \quad \varrho \ge 0 \tag{19}$$

and hence

$$E_{\mathrm{CK}}(R, Q) \ge E_{\mathrm{G}}(R, Q) \tag{20}$$
with the inequalities typically being strict. These inequalities are a consequence
of the fact that the “average constant composition code” performs
better than the “average independent and identically distributed code” p.
However, when optimized over the input distributions, the inequalities turn
into equalities P, [Zj, [2, Ex. 23 in Ch. 2.5]

$$\max_{Q} E_{\mathrm{CK},0}(\varrho, Q) = \max_{Q} E_{\mathrm{G},0}(\varrho, Q), \quad \varrho \ge 0 \tag{21}$$

and

$$\max_{Q} E_{\mathrm{CK}}(R, Q) = \max_{Q} E_{\mathrm{G}}(R, Q) \tag{22}$$

i.e.,

$$E_{\mathrm{CK,r}}(R) = E_{\mathrm{G,r}}(R). \tag{23}$$

In fact, as shown in Appendix A, the optimization problems appearing on
the LHS and on the RHS of (21) are Lagrange duals.
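The inequality (19) can be probed numerically through the dual form (18). The following is only a sketch: the Z-channel, the input law and the grid search over binary output laws are hypothetical choices, and the grid minimum merely approximates the true minimum in (18) from above.

```python
import math

def e_g0(rho, q, w):
    """E_{G,0}(rho, Q) of (4) for a finite channel w[x][y] = W(y|x)."""
    return -math.log(sum(
        sum(q[x] * w[x][y] ** (1 / (1 + rho)) for x in range(len(q))) ** (1 + rho)
        for y in range(len(w[0]))))

def dual_objective(rho, q, w, r):
    """Bracketed expression in (18) evaluated at an output law r."""
    return -(1 + rho) * sum(
        q[x] * math.log(sum(w[x][y] ** (1 / (1 + rho)) * r[y] ** (rho / (1 + rho))
                            for y in range(len(r))))
        for x in range(len(q)))

def e_ck0_binary_output(rho, q, w, grid=2000):
    """Grid approximation of the minimum in (18); binary output alphabets only."""
    return min(dual_objective(rho, q, w, [k / grid, 1 - k / grid])
               for k in range(1, grid))

# Hypothetical example: a Z-channel with a non-uniform input law.
z_channel = [[1.0, 0.0], [0.3, 0.7]]
q_example = [0.7, 0.3]
ck_value = e_ck0_binary_output(1.0, q_example, z_channel)
g_value = e_g0(1.0, q_example, z_channel)
```

Because every output law gives a value of the bracketed expression that is at least as large as Gallager's function, the grid minimum `ck_value` cannot fall below `g_value`.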

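Consequently, we shall henceforth denote $\max_Q E_{\mathrm{CK},0}(\varrho, Q)$ ($= \max_Q E_{\mathrm{G},0}(\varrho, Q)$)
by $E_0(\varrho)$ and refer to $E_{\mathrm{G,r}}(R)$ ($= E_{\mathrm{CK,r}}(R)$) as the random coding error
exponent and denote it by $E_{\mathrm{r}}(R)$. In terms of the function $E_0(\cdot)$ the random
coding error exponent $E_{\mathrm{r}}(R)$ is thus given by

$$E_{\mathrm{r}}(R) = \max_{0 \le \varrho \le 1} \{ E_0(\varrho) - \varrho R \}. \tag{24}$$

The cut-off rate $R_0$ is defined by

$$R_0 = E_0(\varrho) \big|_{\varrho = 1}. \tag{25}$$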

The function $E_0(\varrho)$ also plays an important role in the study of upper
bounds to the reliability function. In fact, the sphere packing error exponent
$E_{\mathrm{sp}}(R)$ is given by P

$$E_{\mathrm{sp}}(R) = \max_{\varrho \ge 0} \{ E_0(\varrho) - \varrho R \}. \tag{26}$$

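Combining (21) with (4) and (18) we obtain the two equivalent expressions
for $E_0(\varrho)$

$$E_0(\varrho) = \max_{Q} \Bigl\{ -\log \sum_{y \in \mathcal{Y}} \Bigl( \sum_{x \in \mathcal{X}} Q(x) W(y|x)^{\frac{1}{1+\varrho}} \Bigr)^{1+\varrho} \Bigr\} \tag{27}$$

$$E_0(\varrho) = \max_{Q} \min_{R} \Bigl\{ -(1+\varrho) \sum_{x \in \mathcal{X}} Q(x) \log \sum_{y \in \mathcal{Y}} W(y|x)^{\frac{1}{1+\varrho}} R(y)^{\frac{\varrho}{1+\varrho}} \Bigr\}. \tag{28}$$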
We refer to the former expression as the “primal” expression and to the latter
as the “dual” expression. The primal expression is useful for the derivation
of lower bounds on $E_0(\varrho)$. Indeed, any distribution $Q$ on the input alphabet
$\mathcal{X}$ induces the lower bound

$$E_0(\varrho) \ge -\log \sum_{y \in \mathcal{Y}} \Bigl( \sum_{x \in \mathcal{X}} Q(x) W(y|x)^{\frac{1}{1+\varrho}} \Bigr)^{1+\varrho}. \tag{29}$$

On the other hand, the dual expression is useful for the derivation of upper
bounds. Any distribution $R$ on the output alphabet $\mathcal{Y}$ yields the upper bound

$$E_0(\varrho) \le \max_{Q} \Bigl\{ -(1+\varrho) \sum_{x \in \mathcal{X}} Q(x) \log \sum_{y \in \mathcal{Y}} W(y|x)^{\frac{1}{1+\varrho}} R(y)^{\frac{\varrho}{1+\varrho}} \Bigr\} \tag{30}$$

$$= \max_{x \in \mathcal{X}} \Bigl\{ -(1+\varrho) \log \sum_{y \in \mathcal{Y}} W(y|x)^{\frac{1}{1+\varrho}} R(y)^{\frac{\varrho}{1+\varrho}} \Bigr\}. \tag{31}$$
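The primal bound (29) and the dual bound (31) together sandwich the function from both sides. The sketch below illustrates this for two small channels; the channels, the input law `q_guess` and the output law `r_guess` are hypothetical choices made for illustration only.

```python
import math

def primal_lower_bound(rho, q, w):
    """Right-hand side of (29) for an input law q; w[x][y] = W(y|x)."""
    return -math.log(sum(
        sum(q[x] * w[x][y] ** (1 / (1 + rho)) for x in range(len(q))) ** (1 + rho)
        for y in range(len(w[0]))))

def dual_upper_bound(rho, r, w):
    """Right-hand side of (31) for an output law r (maximum over input letters)."""
    return max(
        -(1 + rho) * math.log(sum(w[x][y] ** (1 / (1 + rho)) * r[y] ** (rho / (1 + rho))
                                  for y in range(len(r))))
        for x in range(len(w)))

# Hypothetical asymmetric example: the two bounds bracket E_0(rho).
z_channel = [[1.0, 0.0], [0.5, 0.5]]
q_guess = [0.6, 0.4]
r_guess = [0.8, 0.2]          # output law induced by q_guess on this channel
lower = primal_lower_bound(1.0, q_guess, z_channel)
upper = dual_upper_bound(1.0, r_guess, z_channel)

# Hypothetical symmetric example: uniform laws make the two bounds coincide.
bsc = [[0.9, 0.1], [0.1, 0.9]]
bsc_lower = primal_lower_bound(1.0, [0.5, 0.5], bsc)
bsc_upper = dual_upper_bound(1.0, [0.5, 0.5], bsc)
```

When the two numbers agree, as for the symmetric example, the value of the function is pinned down without solving either optimization.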


1.2 Constrained Inputs 

Before we can use the above bounds for fading channels we need to extend
the discussion to cost constrained channels and to channels over infinite input
and output alphabets where the method of types cannot be directly used.
For now we continue our assumption of finite alphabets and address the cost
constraint.

Suppose we limit ourselves to blockcode transmissions where we only
allow codewords $(x_1, \ldots, x_n)$ that satisfy

$$\sum_{\ell=1}^{n} g(x_\ell) \le nT \tag{32}$$

where $g : \mathcal{X} \to \mathbb{R}^{+}$ is a cost function on the input alphabet $\mathcal{X}$, $T$ is some
pre-specified non-negative number, and $n$, as before, is the blocklength. The
reliability function $E(R)$ is defined as in (1) with the modification that $P_e(n, R)$
should be now understood as the lowest average probability of error that can
be achieved using a rate-$R$ blocklength-$n$ codebook all of whose codewords
satisfy the cost constraint.

To obtain lower bounds on $E(R)$ Gallager [3], [Hj modified his random
coding argument in two ways. He introduced a new ensemble of codebooks
and introduced an improved technique to analyze the average probability of
error over this ensemble. For any probability law $Q$ on the input alphabet
satisfying

$$\mathsf{E}_Q[g(X)] \le T \tag{33}$$
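where

$$\mathsf{E}_Q[g(X)] = \sum_{x \in \mathcal{X}} Q(x) g(x) \tag{34}$$

define

$$E^{\mathrm{m}}_{\mathrm{G},0}(\varrho, Q) = \begin{cases} E_{\mathrm{G},0}(\varrho, Q) & \text{if } \mathsf{E}_Q[g(X)] < T \\ \max_{r \ge 0} E_0(\varrho, Q, r) & \text{if } \mathsf{E}_Q[g(X)] = T \end{cases} \tag{35}$$

where

$$E_0(\varrho, Q, r) = -\log \sum_{y \in \mathcal{Y}} \Bigl( \sum_{x \in \mathcal{X}} Q(x) e^{r(g(x) - T)} W(y|x)^{\frac{1}{1+\varrho}} \Bigr)^{1+\varrho}. \tag{36}$$

Note that

$$E_0(\varrho, Q, r) \big|_{r=0} = E_{\mathrm{G},0}(\varrho, Q) \tag{37}$$

and hence

$$\max_{r \ge 0} E_0(\varrho, Q, r) \ge E^{\mathrm{m}}_{\mathrm{G},0}(\varrho, Q) \ge E_{\mathrm{G},0}(\varrho, Q). \tag{38}$$

Thus, Gallager’s “modification” can only tighten the bound.
Gallager then showed that for any $0 \le \varrho \le 1$ the exponent

$$E^{\mathrm{m}}_{\mathrm{G},0}(\varrho, Q) - \varrho R$$

is achievable using block codes that satisfy the constraint.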
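To make the role of the extra parameter r concrete, here is a small sketch of the tilted function and of the modified function for a finite channel. The three-letter channel, the cost values, the budget and the finite grid used in place of the maximization over all r are assumptions for the example; the superscript-m notation follows the reading of the definitions used in this section.

```python
import math

def e0_tilted(rho, q, w, cost, budget, r):
    """E_0(rho, Q, r) with cost function g = cost and allowed cost T = budget."""
    total = 0.0
    for y in range(len(w[0])):
        inner = sum(q[x] * math.exp(r * (cost[x] - budget)) * w[x][y] ** (1 / (1 + rho))
                    for x in range(len(q)))
        total += inner ** (1 + rho)
    return -math.log(total)

def modified_e0(rho, q, w, cost, budget, r_grid):
    """Modified function: r = 0 if the cost is strictly below the budget,
    otherwise the best r on r_grid (a stand-in for the max over all r >= 0)."""
    mean_cost = sum(qx * gx for qx, gx in zip(q, cost))
    if mean_cost < budget - 1e-12:
        return e0_tilted(rho, q, w, cost, budget, 0.0)
    return max(e0_tilted(rho, q, w, cost, budget, r) for r in r_grid)

# Hypothetical three-input channel whose input law meets the budget with equality.
channel = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]
cost = [0.0, 1.0, 4.0]
q_law = [0.5, 0.4, 0.1]            # mean cost 0.8
budget = 0.8
r_grid = [k / 100 for k in range(0, 301)]
plain = e0_tilted(1.0, q_law, channel, cost, budget, 0.0)
tilted = modified_e0(1.0, q_law, channel, cost, budget, r_grid)
```

Since the grid contains r = 0, `tilted` can never be smaller than `plain`, which mirrors the remark that the modification can only tighten the bound.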

(To prove this result when $\mathsf{E}_Q[g(X)] < T$ he considered an ensemble
of codebooks where the codewords are chosen independently of each other,
each according to the a-posteriori law of a sequence $X_1, \ldots, X_n$ drawn IID
according to $Q$ conditional on $\sum_{k=1}^{n} g(X_k) \le nT$. To prove the result when
$\mathsf{E}_Q[g(X)] = T$ he considered an ensemble similarly constructed but with the
distribution being conditional on $nT - \delta < \sum_{k=1}^{n} g(X_k) \le nT$.)

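Consequently the error exponent

$$E^{\mathrm{m}}_{\mathrm{G,r}}(R, T) \triangleq \max_{0 \le \varrho \le 1} \{ E^{\mathrm{m}}_{\mathrm{G},0}(\varrho, T) - \varrho R \} \tag{39}$$

where

$$E^{\mathrm{m}}_{\mathrm{G},0}(\varrho, T) \triangleq \max_{Q : \mathsf{E}_Q[g(X)] \le T} E^{\mathrm{m}}_{\mathrm{G},0}(\varrho, Q) \tag{40}$$

is achievable.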

It is instructive to distinguish between two types of constraints. We say
that the cost constraint is inactive if there exists some input distribution $Q^*$
satisfying the constraint that achieves the global unconstrained maximum of
$E_{\mathrm{G},0}(\varrho, Q)$. That is,

$$\exists\, Q^* : \mathsf{E}_{Q^*}[g(X)] \le T \ \text{ and } \ E_{\mathrm{G},0}(\varrho, Q^*) = \max_{Q} E_{\mathrm{G},0}(\varrho, Q) \tag{41}$$

or equivalently

$$\max_{Q : \mathsf{E}_Q[g(X)] \le T} E_{\mathrm{G},0}(\varrho, Q) = \max_{Q} E_{\mathrm{G},0}(\varrho, Q). \tag{42}$$

Otherwise, we say that the cost constraint is active. With these definitions
it can be shown that (40) simplifies to

$$E^{\mathrm{m}}_{\mathrm{G},0}(\varrho, T) = \begin{cases} \max_{Q : \mathsf{E}_Q[g(X)] = T} \ \max_{r \ge 0} E_0(\varrho, Q, r) & \text{cost active} \\ \max_{Q} E_{\mathrm{G},0}(\varrho, Q) & \text{cost inactive} \end{cases} \tag{43}$$


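(The case where the cost constraint is active follows from Gallager’s observation
that when the cost constraint is active, the maximum of $E_0(\varrho, Q, r)$
over all $r \ge 0$ and over all laws $Q$ satisfying (33) is achieved by an input
distribution $Q^*$ satisfying the constraint with equality. The case where the
cost constraint is inactive follows by noting that by starting from (40) we
have for inactive cost constraints

$$\max_{Q : \mathsf{E}_Q[g(X)] \le T} E^{\mathrm{m}}_{\mathrm{G},0}(\varrho, Q) \ge \max_{Q : \mathsf{E}_Q[g(X)] \le T} E_{\mathrm{G},0}(\varrho, Q)$$

$$= \max_{Q} E_{\mathrm{G},0}(\varrho, Q)$$

$$= \max_{Q} E_{\mathrm{CK},0}(\varrho, Q)$$

$$\ge \max_{Q : \mathsf{E}_Q[g(X)] \le T} E_{\mathrm{CK},0}(\varrho, Q)$$

$$\ge \max_{Q : \mathsf{E}_Q[g(X)] \le T} E^{\mathrm{m}}_{\mathrm{G},0}(\varrho, Q)$$

so that all inequalities must hold with equalities. Here the first inequality
follows from (38); the subsequent equality because the cost constraint
is assumed inactive (42); the subsequent equality from (21); the next
inequality because restricting the set of input laws cannot increase the
maximum; and the final inequality from (46) ahead.)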

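An achievable error exponent can also be demonstrated using constant
composition codes. This yields that the error exponent

$$E_{\mathrm{CK,r}}(R, T) \triangleq \max_{0 \le \varrho \le 1} \{ E_{\mathrm{CK},0}(\varrho, T) - \varrho R \} \tag{44}$$

is achievable where

$$E_{\mathrm{CK},0}(\varrho, T) = \max_{Q : \mathsf{E}_Q[g(X)] \le T} E_{\mathrm{CK},0}(\varrho, Q). \tag{45}$$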


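The relation (38) notwithstanding, it can be shown that for any law $Q$
satisfying (33) and any $\varrho \ge 0$

$$E_{\mathrm{CK},0}(\varrho, Q) \ge E^{\mathrm{m}}_{\mathrm{G},0}(\varrho, Q) \tag{46}$$

with the inequality being, in general, strict.¹ Consequently, by (40) and (45)
we have $E_{\mathrm{CK},0}(\varrho, T) \ge E^{\mathrm{m}}_{\mathrm{G},0}(\varrho, T)$. However, as shown in Appendix B, this
holds with equality

$$E_{\mathrm{CK},0}(\varrho, T) = E^{\mathrm{m}}_{\mathrm{G},0}(\varrho, T). \tag{47}$$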

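Thus, denoting the two identical functions $E^{\mathrm{m}}_{\mathrm{G},0}(\varrho, T)$ and $E_{\mathrm{CK},0}(\varrho, T)$ by
$E_0(\varrho, T)$ and the two identical functions $E_{\mathrm{CK,r}}(R, T)$ and $E^{\mathrm{m}}_{\mathrm{G,r}}(R, T)$ by $E_{\mathrm{r}}(R, T)$
we have

$$E_{\mathrm{r}}(R, T) = \max_{0 \le \varrho \le 1} \{ E_0(\varrho, T) - \varrho R \} \tag{48}$$

where $E_0(\varrho, T)$ can be expressed either by (43) as

$$E_0(\varrho, T) = \begin{cases} \max_{Q : \mathsf{E}_Q[g(X)] = T} \ \max_{r \ge 0} E_0(\varrho, Q, r) & \text{cost active} \\ \max_{Q} E_{\mathrm{G},0}(\varrho, Q) & \text{cost inactive} \end{cases} \tag{49}$$

or, using (18), as

$$E_0(\varrho, T) = \max_{Q : \mathsf{E}_Q[g(X)] \le T} \min_{R} \Bigl\{ -(1+\varrho) \sum_{x \in \mathcal{X}} Q(x) \log \sum_{y \in \mathcal{Y}} W(y|x)^{\frac{1}{1+\varrho}} R(y)^{\frac{\varrho}{1+\varrho}} \Bigr\}. \tag{50}$$

The former, to which we refer as the “primal” expression, is useful for the
derivation of lower bounds on $E_0(\varrho, T)$ whereas the latter, the “dual”, is
useful for upper bounds.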

2 Continuous Alphabets 

We next extend the discussion to channels over infinite input and output
alphabets. Consider a channel $W(\cdot|\cdot)$ whose inputs and outputs take value
in the separable metric spaces $\mathcal{X}$ and $\mathcal{Y}$ respectively. Thus for any input
$x \in \mathcal{X}$ and any Borel set $B \subset \mathcal{Y}$ the probability that in response to the input
$x$ the channel will produce an output $Y$ that lies in the set $B$ is $W(B|x)$. We
assume that the mapping $x \mapsto W(B|x)$ from $\mathcal{X}$ to the interval $[0, 1]$ is Borel
measurable. Finally assume the existence of an underlying positive measure
$\mu$ on $\mathcal{Y}$ with respect to which all the probability measures $\{W(\cdot|x), x \in \mathcal{X}\}$
are absolutely continuous. Denote the Radon-Nikodym derivative of $W(\cdot|x)$
with respect to $\mu$ by

$$w(\cdot|x) = \frac{\mathrm{d}W(\cdot|x)}{\mathrm{d}\mu}, \quad x \in \mathcal{X}.$$

Thus, $w(y|x)$ is the density at $y$ of the channel output corresponding to the
input $x \in \mathcal{X}$. For any input $x \in \mathcal{X}$ and any Borel set $B \subset \mathcal{Y}$

$$W(B|x) = \int_{B} w(y|x)\, \mathrm{d}\mu(y). \tag{51}$$

¹ In the case $\mathsf{E}_Q[g(X)] < T$ this follows directly from (20). For a proof in the case
$\mathsf{E}_Q[g(X)] = T$ see Proposition 1 ahead, which proves that the RHS of (18) is greater or
equal $E^{\mathrm{m}}_{\mathrm{G},0}(\varrho, Q)$.

As to the cost, we shall assume that the function $g : \mathcal{X} \to \mathbb{R}^{+}$ is measurable
and consider block codes that satisfy (32). We extend the definition
(34) to infinite alphabets as

$$\mathsf{E}_Q[g(X)] \triangleq \int_{\mathcal{X}} g(x)\, \mathrm{d}Q(x). \tag{52}$$

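Definition (36) is extended for any probability law $Q$ on $\mathcal{X}$ as

$$E_0(\varrho, Q, r) = -\log \int_{y \in \mathcal{Y}} \Bigl( \int_{x \in \mathcal{X}} e^{r(g(x) - T)} w(y|x)^{\frac{1}{1+\varrho}}\, \mathrm{d}Q(x) \Bigr)^{1+\varrho} \mathrm{d}\mu(y). \tag{53}$$

For any input distribution $Q$ satisfying the constraint $\mathsf{E}_Q[g(X)] \le T$ we
extend (35) as follows:

$$E^{\mathrm{m}}_{0}(\varrho, Q) \triangleq \begin{cases} \sup_{r \ge 0} E_0(\varrho, Q, r) & \text{if } \mathsf{E}_Q[g(X)] = T \text{ and } \mathsf{E}_Q[g^3(X)] < \infty \\ E_0(\varrho, Q, r) \big|_{r=0} & \text{otherwise.} \end{cases} \tag{54}$$

(Note that following Gallager H, jHI we allow for the optimization over $r$ only
when under the law $Q$ the random variable $g(X)$ has a finite third moment.)
With this definition we can now define

$$E_0(\varrho, T) \triangleq \sup_{Q : \mathsf{E}_Q[g(X)] \le T} E^{\mathrm{m}}_{0}(\varrho, Q) \tag{55}$$

and the cut-off rate as

$$R_0(T) \triangleq E_0(\varrho, T) \big|_{\varrho = 1}. \tag{56}$$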
The random coding error exponent

$$\sup_{0 \le \varrho \le 1} \{ E_0(\varrho, T) - \varrho R \}$$

is achievable with block codes satisfying the constraint m i, iHi- 

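The following proposition proves (46) in the more general case where the
alphabets may be continuous. It is particularly useful for the derivation of
upper bounds on $E_0(\varrho, T)$.

Proposition 1. Consider as above a discrete-time memoryless infinite alphabet
channel $w(y|x)$, an output measure $\mu$, a measurable cost function
$g : \mathcal{X} \to \mathbb{R}^{+}$, and some arbitrary allowed cost $T$. Let $f_R$ be an arbitrary
density with respect to $\mu$ on the output alphabet $\mathcal{Y}$. Then for any distribution
$Q$ on $\mathcal{X}$ satisfying the cost constraint $\mathsf{E}_Q[g(X)] \le T$

$$-(1+\varrho) \int_{x \in \mathcal{X}} \log \Bigl( \int_{y \in \mathcal{Y}} w(y|x)^{\frac{1}{1+\varrho}} f_R(y)^{\frac{\varrho}{1+\varrho}}\, \mathrm{d}\mu(y) \Bigr) \mathrm{d}Q(x) \ge E^{\mathrm{m}}_{0}(\varrho, Q). \tag{57}$$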


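Proof. Distinguish between the case where $\mathsf{E}_Q[g(X)] < T$ and the case where
$\mathsf{E}_Q[g(X)] = T$ and $\mathsf{E}_Q[g^3(X)] < \infty$. In the former case, by (54),
$E^{\mathrm{m}}_{0}(\varrho, Q) = E_0(\varrho, Q, 0)$ and the result follows by an application of Jensen’s
inequality and Hölder’s inequality:

$$-(1+\varrho) \int_{x \in \mathcal{X}} \log \Bigl( \int_{y \in \mathcal{Y}} w(y|x)^{\frac{1}{1+\varrho}} f_R(y)^{\frac{\varrho}{1+\varrho}}\, \mathrm{d}\mu(y) \Bigr) \mathrm{d}Q(x)$$

$$\ge -(1+\varrho) \log \int_{x \in \mathcal{X}} \int_{y \in \mathcal{Y}} w(y|x)^{\frac{1}{1+\varrho}} f_R(y)^{\frac{\varrho}{1+\varrho}}\, \mathrm{d}\mu(y)\, \mathrm{d}Q(x)$$

$$\ge -\log \int_{y \in \mathcal{Y}} \Bigl( \int_{x \in \mathcal{X}} w(y|x)^{\frac{1}{1+\varrho}}\, \mathrm{d}Q(x) \Bigr)^{1+\varrho} \mathrm{d}\mu(y)$$

$$= E_0(\varrho, Q, 0).$$

As for the case where $\mathsf{E}_Q[g(X)] = T$ (and $\mathsf{E}_Q[g^3(X)] < \infty$) we have for
any $r \ge 0$

$$-(1+\varrho) \int_{x \in \mathcal{X}} \log \Bigl( \int_{y \in \mathcal{Y}} w(y|x)^{\frac{1}{1+\varrho}} f_R(y)^{\frac{\varrho}{1+\varrho}}\, \mathrm{d}\mu(y) \Bigr) \mathrm{d}Q(x)$$

$$= r(1+\varrho) \bigl( \mathsf{E}_Q[g(X)] - T \bigr) - (1+\varrho) \int_{x \in \mathcal{X}} \log \Bigl( \int_{y \in \mathcal{Y}} e^{r(g(x) - T)} w(y|x)^{\frac{1}{1+\varrho}} f_R(y)^{\frac{\varrho}{1+\varrho}}\, \mathrm{d}\mu(y) \Bigr) \mathrm{d}Q(x)$$

$$= -(1+\varrho) \int_{x \in \mathcal{X}} \log \Bigl( \int_{y \in \mathcal{Y}} e^{r(g(x) - T)} w(y|x)^{\frac{1}{1+\varrho}} f_R(y)^{\frac{\varrho}{1+\varrho}}\, \mathrm{d}\mu(y) \Bigr) \mathrm{d}Q(x)$$

$$\ge -(1+\varrho) \log \int_{x \in \mathcal{X}} \int_{y \in \mathcal{Y}} e^{r(g(x) - T)} w(y|x)^{\frac{1}{1+\varrho}} f_R(y)^{\frac{\varrho}{1+\varrho}}\, \mathrm{d}\mu(y)\, \mathrm{d}Q(x)$$

$$\ge -\log \int_{y \in \mathcal{Y}} \Bigl( \int_{x \in \mathcal{X}} e^{r(g(x) - T)} w(y|x)^{\frac{1}{1+\varrho}}\, \mathrm{d}Q(x) \Bigr)^{1+\varrho} \mathrm{d}\mu(y)$$

$$= E_0(\varrho, Q, r) \tag{58}$$

where the second equality follows because in the case we are considering now
$\mathsf{E}_Q[g(X)] = T$; the first inequality by Jensen’s inequality, and the subsequent
by Hölder’s inequality. The result for this case now follows because $r \ge 0$ in
the above is arbitrary. □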
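Proposition 1 also covers finite alphabets (take the output measure to be the counting measure), which allows a quick numerical sanity check. The sketch below is only an illustration: the three-letter channel, cost values, input law and candidate output laws are hypothetical, and a finite list of r values stands in for "any r".

```python
import math

def e0_tilted(rho, q, w, cost, budget, r):
    """Discrete version of E_0(rho, Q, r) with counting measure on the outputs."""
    total = 0.0
    for y in range(len(w[0])):
        inner = sum(q[x] * math.exp(r * (cost[x] - budget)) * w[x][y] ** (1 / (1 + rho))
                    for x in range(len(q)))
        total += inner ** (1 + rho)
    return -math.log(total)

def dual_side(rho, q, w, out_law):
    """Discrete version of the left-hand side of the proposition's inequality."""
    return -(1 + rho) * sum(
        q[x] * math.log(sum(w[x][y] ** (1 / (1 + rho)) * out_law[y] ** (rho / (1 + rho))
                            for y in range(len(out_law))))
        for x in range(len(q)))

# Hypothetical example in which the input law meets the cost budget with equality.
channel = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]
cost = [0.0, 1.0, 4.0]
q_law = [0.5, 0.4, 0.1]            # mean cost 0.8
budget = 0.8
out_laws = [[1 / 3, 1 / 3, 1 / 3], [0.6, 0.3, 0.1]]
r_values = [0.0, 0.1, 0.5, 1.0, 2.0]

# Smallest gap between the dual side and the tilted function over all tried pairs.
worst_gap = min(dual_side(1.0, q_law, channel, law)
                - e0_tilted(1.0, q_law, channel, cost, budget, r)
                for law in out_laws for r in r_values)
```

A non-negative `worst_gap` is what the proposition predicts; a negative value would indicate a coding error rather than a counterexample.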

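To conclude, to derive lower bounds on $E_0(\varrho, T)$ we can choose any input
distribution $Q$ satisfying the constraint $\mathsf{E}_Q[g(X)] \le T$ to obtain the lower
bound:

$$E_0(\varrho, T) \ge E^{\mathrm{m}}_{0}(\varrho, Q) \tag{59}$$

where $E^{\mathrm{m}}_{0}(\varrho, Q)$ is defined in (54).

To derive upper bounds on $E_0(\varrho, T)$ we can use the above proposition by
choosing some arbitrary output density $f_R(y)$ to obtain

$$E_0(\varrho, T) \le \sup_{Q : \mathsf{E}_Q[g(X)] \le T} \Bigl\{ -(1+\varrho) \int_{x \in \mathcal{X}} \log \Bigl( \int_{y \in \mathcal{Y}} w(y|x)^{\frac{1}{1+\varrho}} f_R(y)^{\frac{\varrho}{1+\varrho}}\, \mathrm{d}\mu(y) \Bigr) \mathrm{d}Q(x) \Bigr\}. \tag{60}$$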


3 Ricean Fading Channels 

The discrete-time memoryless Ricean fading channel with partial receiver
side information is a channel whose input $x$ takes value in the complex field
$\mathbb{C}$ and whose corresponding output consists of a pair of complex random
variables $Y$ and $S$. We shall refer to $Y$ as “the received signal” and to $S$
as the “side information (at the receiver)”. The joint distribution of $Y, S$
corresponding to the input $x \in \mathbb{C}$ is best described using the fading complex
random variable $H$ and the additive noise complex random variable $Z$.

The joint distribution of $H$, $S$, and $Z$ does not depend on the input $x$.
The additive noise $Z$ is independent of the pair $(H, S)$ and has a circularly
symmetric complex Gaussian distribution of positive variance $\sigma^2$. The fading
$H$ is of mean $d \in \mathbb{C}$ (the “specular component”) and it is assumed that
$H - d$ is a unit-variance circularly symmetric complex Gaussian random
variable.² The pair $S$ and $H - d$ are jointly circularly symmetric Gaussian
random variables. We denote the conditional variance of $H$ given $S$ by $\epsilon^2$.

The received signal $Y$ corresponding to the input $x \in \mathbb{C}$ is given by

$$Y = Hx + Z. \tag{61}$$

The case where $\epsilon^2 = 1$ corresponds to the case where $H$ and $S$ are independent,
in which case the receiver can discard $S$ without loss in information
rates. This case corresponds to “non-coherent” fading. In the case $\epsilon^2 = 0$
the receiver can precisely determine the realization of $H$ from $S$. This
corresponds to “coherent detection”. Finally, the case $0 < \epsilon^2 < 1$ corresponds to
“partially coherent” communication. In this case $S$ carries some information
about $H$, but it does not fully determine $H$. In this paper we shall only
consider the case where $\epsilon^2 > 0$. The case $\epsilon^2 = 0$ is much easier to analyze
and has already received considerable attention in the literature. See for
example, Q. nm. un and the references in the latter.
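The channel model above can be simulated with a few lines of code. The sketch below is one possible construction, not the paper's: it assumes that the side information S is itself a unit-variance circularly symmetric Gaussian and that H - d is a weighted sum of S and an independent Gaussian, which yields a conditional variance of H given S equal to the chosen value. The function names and the numerical parameters are hypothetical.

```python
import math
import random

def complex_gaussian(rng, variance=1.0):
    """Circularly symmetric complex Gaussian sample with the given variance."""
    s = math.sqrt(variance / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))

def sample_channel(x, d, eps2, sigma2, rng):
    """One draw of (Y, S, H) for input x.
    ASSUMPTION: H - d = sqrt(1 - eps2) * S + sqrt(eps2) * B with S, B independent
    unit-variance Gaussians, so that Var(H | S) = eps2 and Var(H) = 1."""
    s = complex_gaussian(rng)
    h = d + math.sqrt(1.0 - eps2) * s + math.sqrt(eps2) * complex_gaussian(rng)
    y = h * x + complex_gaussian(rng, sigma2)
    return y, s, h

# Hypothetical parameters: partially coherent fading with eps2 = 0.25.
rng = random.Random(0)
d, eps2, sigma2 = 1.0 + 0.0j, 0.25, 0.5
draws = [sample_channel(2.0 + 0.0j, d, eps2, sigma2, rng) for _ in range(20000)]
fading_var = sum(abs(h - d) ** 2 for _, _, h in draws) / len(draws)
residual_var = sum(abs(h - d - math.sqrt(1 - eps2) * s) ** 2
                   for _, s, h in draws) / len(draws)
```

Setting `eps2 = 1.0` makes S independent of H (the non-coherent case), while values close to zero approach coherent detection.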

The special case of Ricean fading with zero specular component $d$ is called
“Rayleigh fading”. The non-coherent ($\epsilon^2 = 1$) capacity of this channel was
studied in [T2], [IHl and [H]. The coherent case ($\epsilon^2 = 0$) was studied in p.
The capacity of the non-coherent Ricean channel ($\epsilon^2 = 1$ and $d \ne 0$) was
studied in and m-

Unless some restrictions are imposed on the input $x$, the capacity and cut-off
rate of this channel are infinite. Two kinds of restrictions are typically
considered. The first corresponds to an average power constraint. Here only
blockcodes where each codeword satisfies (32) with

$$g(x) = |x|^2 \tag{62}$$

are allowed. In this context rather than denoting the allowed cost by $T$
we shall use the more common symbol $\mathcal{E}$, which stands here for the average
energy per symbol. That is, we only allow blocklength-$n$ codes in which every
codeword $x_1, \ldots, x_n$ satisfies

$$\frac{1}{n} \sum_{\ell=1}^{n} |x_\ell|^2 \le \mathcal{E}. \tag{63}$$

² We shall sometimes refer to such Ricean fading as “normalized Ricean fading” to
make it explicit that the fading is of unit variance. “Un-normalized” Ricean fading need
not have unit-variance. Those can be normalized by scaling the fading and absorbing the
scaling into the input power. Note also that there is no loss in generality in assuming that
$d$ is real and non-negative. The more general complex case can be treated by rotating the
output.

The second type of constraint is a peak power constraint. Here we only
allow channel inputs that satisfy

$$|x|^2 \le \mathcal{E} \tag{64}$$

where $\mathcal{E}$ now stands for the allowed peak power. Such a constraint is best
treated by considering the channel as being free of constraints but with the
input alphabet now being $\{z \in \mathbb{C} : |z|^2 \le \mathcal{E}\}$.

For both the average and peak power constraints we define the signal-to-noise
ratio (SNR) as

$$\mathrm{SNR} \triangleq \frac{\mathcal{E}}{\sigma^2}. \tag{65}$$

Any codebook satisfying the peak power constraint (64) also satisfies the
average power constraint (63); hence the capacity and reliability function under
the peak constraint cannot exceed those under the average constraint.

Irrespective of whether an average power or a peak power constraint is
imposed, at high SNR the capacity $C(\mathrm{SNR}|S)$ of this channel is given
asymptotically as

$$C(\mathrm{SNR}|S) = \log\log\mathrm{SNR} + \log|d|^2 - \mathrm{Ei}(-|d|^2) - 1 + \log\frac{1}{\epsilon^2} + o(1) \tag{66}$$

where the correction term $o(1)$ depends on the SNR and tends to zero as the
SNR tends to infinity. Here $\mathrm{Ei}(\cdot)$ denotes the Exponential Integral function

$$\mathrm{Ei}(-\xi) = -\int_{\xi}^{\infty} \frac{e^{-t}}{t}\, \mathrm{d}t, \quad \xi > 0 \tag{67}$$

and we define the value of the function $\log(\xi) - \mathrm{Ei}(-\xi)$ at $\xi = 0$ as $-\gamma$,
where $\gamma \approx 0.577$ denotes Euler’s constant. (With this definition the function
$\log(\xi) - \mathrm{Ei}(-\xi)$ is continuous from the right at $\xi = 0$.)

Here we shall study the cutoff rate in two cases. First, in the absence of
side information ($\epsilon^2 = 1$) we will show that irrespective of whether a peak or
average power constraint is imposed

$$R_0(\mathrm{SNR}) = \log\log\mathrm{SNR} + \frac{|d|^2}{2} - \log(2\pi) - 2\log I_0\Bigl(\frac{|d|^2}{4}\Bigr) + o(1). \tag{68}$$

Figure 1: The second order terms of $C(\mathrm{SNR})$ and $R_0(\mathrm{SNR})$ and their difference
as functions of the specular component $|d|$ for $\epsilon^2 = 1$, i.e., in the absence
of side information. Upper curve depicts $\lim_{\mathrm{SNR} \to \infty} \{C(\mathrm{SNR}) - \log\log\mathrm{SNR}\}$,
followed by the analogous term for the cutoff rate and their difference.

Here $I_0(\cdot)$ denotes the zero-th order modified Bessel function of the first kind,
which is given by

$$I_0(\xi) = \frac{1}{\pi} \int_{0}^{\pi} e^{\xi\cos\theta}\, \mathrm{d}\theta, \tag{69}$$

and the $o(1)$ term is a correction term that depends on the SNR and that
approaches zero as the SNR tends to infinity.

Figure 1 depicts the second order term (the constant term) in the high
SNR expansion of channel capacity (66) and of the cutoff rate (68) as a
function of the specular component $d$ in the absence of side information. For
a zero specular component the difference between the two second order terms
is $\log(2\pi) - 1 - \gamma \approx 0.26$ nats; for very large specular components ($|d| \to \infty$)
this difference approaches $\log(4/e) \approx 0.39$ nats.³
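The two limiting values of the gap quoted above can be reproduced numerically. The sketch below assumes that the constant terms of the capacity and cutoff rate expansions, in the absence of side information, are log|d|^2 - Ei(-|d|^2) - 1 and |d|^2/2 - log(2 pi) - 2 log I_0(|d|^2/4) respectively. The power-series evaluations of the Bessel function and of the exponential integral are illustrative helpers that are only reliable for moderate arguments.

```python
import math

EULER_GAMMA = 0.5772156649015329

def bessel_i0(z, terms=80):
    """Modified Bessel function I_0(z) via its power series."""
    term, total = 1.0, 1.0
    for k in range(1, terms):
        term *= (z / 2.0) ** 2 / (k * k)
        total += term
    return total

def log_minus_ei(x, terms=120):
    """log(x) - Ei(-x) for x >= 0, equal to -gamma at x = 0.
    Uses -Ei(-x) = -gamma - log(x) + sum_{k>=1} (-1)^(k+1) x^k / (k * k!)."""
    total, power_over_fact = 0.0, 1.0
    for k in range(1, terms):
        power_over_fact *= x / k
        total += (-1) ** (k + 1) * power_over_fact / k
    return -EULER_GAMMA + total

def capacity_term(d2):
    """Assumed constant term of the capacity expansion, no side information."""
    return log_minus_ei(d2) - 1.0

def cutoff_term(d2):
    """Assumed constant term of the cutoff rate expansion, no side information."""
    return d2 / 2.0 - math.log(2 * math.pi) - 2.0 * math.log(bessel_i0(d2 / 4.0))

def gap(d2):
    """Difference of the two constant terms as a function of |d|^2."""
    return capacity_term(d2) - cutoff_term(d2)

rayleigh_gap = gap(0.0)                  # log(2 pi) - 1 - gamma
large_d_limit = math.log(4.0 / math.e)   # limiting value for |d| -> infinity
```

Evaluating `gap` on a grid of moderate values of |d|^2 gives a rough picture of the difference curve that Figure 1 describes.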

For the case where the side information is present but is not perfect
($0 < \epsilon^2 < 1$) we only treat the case of zero specular component ($d = 0$, i.e.,
Rayleigh fading). We obtain the expansion

$$R_0(\mathrm{SNR}|S) = \log\log\mathrm{SNR} + \log\frac{1}{\epsilon^2} - \log K\bigl(\sqrt{1 - \epsilon^4}\bigr) - \log 4 + o(1), \quad 0 < \epsilon^2 < 1, \ d = 0 \tag{70}$$

³ All logarithms in this paper are natural logarithms.

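where $K(\cdot)$ is the complete elliptic integral of the first kind:

$$K(\xi) = \int_{0}^{\pi/2} \frac{1}{\sqrt{1 - \xi^2 \sin^2 t}}\, \mathrm{d}t, \quad |\xi| < 1. \tag{71}$$

For the case of Rayleigh fading with perfect side information ($\epsilon^2 = 0$) see
HD]. For the case of “almost perfect side information” ($0 < \epsilon^2 \ll 1$) we note
the expansion

$$\log\frac{1}{\epsilon^2} - \log K\bigl(\sqrt{1 - \epsilon^4}\bigr) - \log 4 = \log\frac{1}{\epsilon^2} - \log\log\frac{4}{\epsilon^2} - \log 4 + o(\epsilon^2), \quad 0 < \epsilon^2 \ll 1 \tag{72}$$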

which follows from the approximation HH 

■'W = ‘“s 0<'=<l (73) 

for some 

1 — 

0<e< . (74) 

4 

Figure 2 depicts the second order terms of channel capacity (66) and the
cutoff rate (70) as a function of the estimation error $\epsilon^2$ in estimating the
fading from the side information for Rayleigh fading channels ($d = 0$).
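The behavior of the gap under partial side information can be explored numerically. The sketch below assumes that the constant term of the cutoff rate for Rayleigh fading is log(1/eps^2) - log K(sqrt(1 - eps^4)) - log 4 and that the corresponding capacity term is -1 - gamma + log(1/eps^2). The elliptic integral is computed through the arithmetic-geometric mean, which is a standard identity and not the paper's method; the function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329

def ellip_k_complementary(kc, iterations=40):
    """K(sqrt(1 - kc^2)) via the AGM identity K = pi / (2 * AGM(1, kc))."""
    a, b = 1.0, kc
    for _ in range(iterations):
        a, b = (a + b) / 2.0, math.sqrt(a * b)
    return math.pi / (2.0 * a)

def cutoff_term(eps2):
    """Assumed constant term of the cutoff rate; complementary modulus is eps2."""
    return math.log(1.0 / eps2) - math.log(ellip_k_complementary(eps2)) - math.log(4.0)

def capacity_term(eps2):
    """Assumed constant term of capacity for d = 0."""
    return -1.0 - EULER_GAMMA + math.log(1.0 / eps2)

def gap(eps2):
    """Capacity term minus cutoff term as a function of the estimation error."""
    return capacity_term(eps2) - cutoff_term(eps2)

gaps = {eps2: gap(eps2) for eps2 in (1.0, 1e-1, 1e-3)}
```

Passing the complementary modulus directly avoids the loss of precision that computing sqrt(1 - eps^4) would cause for small estimation errors. The gap at estimation error 1 coincides with the non-coherent Rayleigh value, and it grows as the estimation error shrinks.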


4 Derivations for Ricean Channels 


4.1 The Cut-Off Rate in Absence of Side Information 

4.1.1 Upper Bound 

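To derive an upper bound on the cut-off rate of the Ricean channel in the
absence of side information we use Proposition 1 with the density (w.r.t. the
Lebesgue measure $\mu$ on $\mathbb{C}$)

$$f_R(y) = \frac{(|y|^2 + \delta)^{\alpha - 1} e^{-\frac{|y|^2 + \delta}{\beta}}}{\pi \beta^{\alpha} \Gamma(\alpha, \delta/\beta)}, \quad y \in \mathbb{C}. \tag{75}$$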
Figure 2: The second order terms of $C(\mathrm{SNR})$, $R_0(\mathrm{SNR})$ and their difference
as functions of the minimum mean squared error in estimating the fading
from the side information. Rayleigh fading ($d = 0$) is assumed. (Horizontal axis: the estimation error.)

Here the parameters $\delta \ge 0$, $\alpha > 0$, and $\beta > 0$ can be chosen freely in order
to obtain the tightest bound, and $\Gamma(\alpha, \xi)$ denotes the incomplete Gamma
function,

$$\Gamma(\alpha, \xi) = \int_{\xi}^{\infty} t^{\alpha - 1} e^{-t}\, \mathrm{d}t, \quad \alpha > 0, \ \xi \ge 0. \tag{76}$$

(This family of densities was introduced in [H] for the purpose of studying
the fading number.)
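As a quick check that the family is properly normalized, one can integrate in polar coordinates. The computation below assumes that the density reads $f_R(y) = (|y|^2+\delta)^{\alpha-1} e^{-(|y|^2+\delta)/\beta} / (\pi\beta^{\alpha}\Gamma(\alpha,\delta/\beta))$ and uses the substitution $t = (r^2+\delta)/\beta$, where $r = |y|$ is a symbol introduced here for the derivation only.

```latex
\begin{align*}
\int_{\mathbb{C}} f_R(y)\,\mathrm{d}\mu(y)
  &= \int_{0}^{\infty}
     \frac{(r^2+\delta)^{\alpha-1} e^{-(r^2+\delta)/\beta}}
          {\pi\beta^{\alpha}\,\Gamma(\alpha,\delta/\beta)}\; 2\pi r\,\mathrm{d}r \\
  &= \frac{1}{\Gamma(\alpha,\delta/\beta)}
     \int_{\delta/\beta}^{\infty} t^{\alpha-1} e^{-t}\,\mathrm{d}t
   \qquad \Bigl(t = \tfrac{r^2+\delta}{\beta},\ \ 2r\,\mathrm{d}r = \beta\,\mathrm{d}t\Bigr) \\
  &= 1 .
\end{align*}
```

The last step is just the definition of the incomplete Gamma function evaluated at $\delta/\beta$.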

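By Proposition 1 applied with $\varrho = 1$ we obtain for any law $Q$ under which

$$\mathsf{E}_Q\bigl[|X|^2\bigr] \le \mathcal{E} \tag{77}$$

the upper bound

$$E^{\mathrm{m}}_{0}(1, Q) \le -2 \int_{x \in \mathbb{C}} \log \psi(x)\, \mathrm{d}Q(x) \tag{78}$$

where

$$\psi(x) = \int_{y \in \mathbb{C}} \sqrt{w(y|x)\, f_R(y)}\; \mathrm{d}\mu(y) \tag{79}$$

$$= \frac{2\, e^{-\frac{\delta}{2\beta}}\, e^{-\frac{|d|^2 |x|^2}{2(|x|^2 + \sigma^2)}}}{\sqrt{\Gamma(\alpha, \delta/\beta)}\; \beta^{\alpha/2} \sqrt{|x|^2 + \sigma^2}}\; \ell(x; \alpha, \beta, \delta) \tag{80}$$

and from [18, 3.338]

$$\ell(x; \alpha, \beta, \delta) = \int_{0}^{\infty} e^{-\frac{\rho^2 (\beta + |x|^2 + \sigma^2)}{2\beta(|x|^2 + \sigma^2)}}\, \rho\, (\rho^2 + \delta)^{\frac{\alpha - 1}{2}}\, I_0\Bigl( \frac{|d|\,|x|\,\rho}{|x|^2 + \sigma^2} \Bigr) \mathrm{d}\rho. \tag{81}$$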


For our high SNR analysis it will suffice to consider (for sufficiently large
powers $\mathcal{E}$) the possibly sub-optimal choice of the parameters

$$\beta = \mathcal{E} \log \mathcal{E}, \qquad \alpha = \frac{\delta}{\log \beta} \tag{82}$$

and to consider the limiting behavior of the bound as $\mathcal{E} \to \infty$. After taking
this limit with $\delta > 0$ held fixed we shall consider the additional limit of
$\delta \downarrow 0$.

The analytic computation of $\ell(x; \alpha, \beta, \delta)$ is difficult. Note, however, that
any lower bound to this quantity will yield an upper bound on $E^{\mathrm{m}}_{0}(1, Q)$.
Also, the integral is computable when both $\alpha$ and $\delta$ are formally set to zero.⁴

⁴ In fact, it suffices that $\delta$ be set to zero.
We can thus use a limiting argument to study $\ell(x; \alpha, \beta, \delta)$ for $\alpha, \delta$ very small.
Indeed, in Appendix C it is shown that

$$\ell(x; \alpha, \beta, \delta) \ge a(\alpha, \beta, \delta, m_1) \cdot \ell(x; \alpha = 0, \beta, \delta = 0)$$


where 


a{a, P, 6, mp 


^«/2 


mi 


mi + 1 



V^'lo 


7r/3(T^ 

2(/3+^h 


(83) 


(84) 


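$m_1 > 0$ being some constant. As we shall see, the term $a(\alpha, \beta, \delta, m_1)$ will
have a negligible asymptotic contribution to our bound.

The term $\ell(x; \alpha = 0, \beta, \delta = 0)$ can be computed analytically [18, 6.618]:

$$\ell(x; \alpha = 0, \beta, \delta = 0) = \sqrt{\frac{\pi}{2}} \sqrt{\frac{\beta (|x|^2 + \sigma^2)}{\beta + |x|^2 + \sigma^2}}\; e^{\frac{|d|^2 |x|^2 \beta}{4(|x|^2 + \sigma^2)(\beta + |x|^2 + \sigma^2)}}\; I_0\Bigl( \frac{|d|^2 |x|^2 \beta}{4(|x|^2 + \sigma^2)(\beta + |x|^2 + \sigma^2)} \Bigr). \tag{85}$$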


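We thus conclude from (78), (80), (83), and (85)

$$E^{\mathrm{m}}_{0}(1, Q) \le \frac{\delta}{\beta} - 2 \log a(\alpha, \beta, \delta, m_1) + \alpha \log \beta + \log \Gamma\Bigl(\alpha, \frac{\delta}{\beta}\Bigr) - \log(2\pi)$$

$$+ \mathsf{E}\Bigl[ \log\Bigl( 1 + \frac{|X|^2 + \sigma^2}{\beta} \Bigr) \Bigr] + |d|^2\, \mathsf{E}\Bigl[ \frac{|X|^2}{|X|^2 + \sigma^2} \cdot \frac{|X|^2 + \sigma^2}{\beta + |X|^2 + \sigma^2} \Bigr]$$

$$+ \mathsf{E}\Bigl[ \frac{|d|^2}{2} \frac{|X|^2}{|X|^2 + \sigma^2} \frac{\beta}{\beta + |X|^2 + \sigma^2} - 2 \log I_0\Bigl( \frac{|d|^2}{4} \frac{|X|^2}{|X|^2 + \sigma^2} \frac{\beta}{\beta + |X|^2 + \sigma^2} \Bigr) \Bigr].$$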


The expectations in the above cannot be computed without knowledge of
the law $Q$. We thus proceed to upper bound the expectations using the
average power constraint (77). The expectation of the logarithm is upper
bounded using Jensen’s inequality and the power constraint (77); the
following expectation is upper bounded using the point-wise upper bound
$|x|^2 / (|x|^2 + \sigma^2) \le 1$, Jensen’s inequality, and the power constraint (77); and
the final expectation by noting that the function $\xi \mapsto \xi - 2 \log I_0(\xi/2)$ is
monotonically increasing and by noting that

$$\frac{|d|^2}{2} \frac{|X|^2}{|X|^2 + \sigma^2} \frac{\beta}{\beta + |X|^2 + \sigma^2} \le \frac{|d|^2}{2}.$$

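We thus conclude that with the allowed average power $\mathcal{E}$ the cut-off rate
satisfies:

$$R_0(\mathcal{E}) - \log\log\frac{\mathcal{E}}{\sigma^2} \le \frac{\delta}{\beta} - 2 \log a(\alpha, \beta, \delta, m_1) + \alpha \log \beta + \log \Gamma\Bigl(\alpha, \frac{\delta}{\beta}\Bigr) - \log\log\frac{\mathcal{E}}{\sigma^2}$$

$$+ \log\Bigl( 1 + \frac{\mathcal{E} + \sigma^2}{\beta} \Bigr) + |d|^2 \frac{\mathcal{E} + \sigma^2}{\beta + \mathcal{E} + \sigma^2} + \frac{|d|^2}{2} - 2 \log I_0\Bigl( \frac{|d|^2}{4} \Bigr) - \log(2\pi).$$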


Holding $\delta > 0$ (small) and $m_1 > 0$ (large) fixed, and letting $\mathcal{E} \to \infty$ with
$\alpha = \alpha(\mathcal{E})$ and $\beta = \beta(\mathcal{E})$ as in (82) we obtain from the above and (84)


_ £ 

lim {Ro{£) - log log ^} < log 


>oo 




log 


mi -|- 1 
mi 

1 - e-^ 
5 


- 2 log 1 - 


-Iq 


/mi6 


2a 




where in computing the limiting difference between $\log \Gamma(\alpha, \delta/\beta)$
and $\log\log(\mathcal{E}/\sigma^2)$ we used nn Appendix XI]. Holding $m_1$ fixed and
letting $\delta \downarrow 0$ we obtain


_ S 

lim {Rq{£)- log log ^} < log 




mi -|- 1 
mi 




log(27r). 
Letting now $m_1$ tend to infinity we obtain the desired asymptotic upper
bound

$$\varlimsup_{\mathcal{E} \to \infty} \Bigl\{ R_0(\mathcal{E}) - \log\log\frac{\mathcal{E}}{\sigma^2} \Bigr\} \le \frac{|d|^2}{2} - 2 \log I_0\Bigl( \frac{|d|^2}{4} \Bigr) - \log(2\pi). \tag{86}$$


4.1.2 Lower Bound 


Any input distribution satisfying the cost constraint (possibly strictly)
induces a lower bound on the cut-off rate (59). Indeed, for any input
distribution $Q$ satisfying the cost constraint

$$R_0(T) \ge E^{\mathrm{m}}_{0}(\varrho, Q) \big|_{\varrho = 1} \tag{87}$$

$$\ge E_0(\varrho, Q, r) \big|_{\varrho = 1, r = 0} \tag{88}$$

where the first inequality follows by the definition of the cut-off rate (56)
(and holds with equality if $Q$ achieves the cut-off rate) and where the second
inequality follows from (54) (and holds with equality if $Q$ satisfies the cost
constraint with strict inequality).

We thus proceed to lower bound $E_0(1, Q, 0)$ for a law $Q$ of our choice.
Under this law, $X$ is a circularly symmetric random variable with

$$\log |X|^2 \sim \mathrm{Uniform}\bigl( [\log\log\mathcal{E}, \log\mathcal{E}] \bigr). \tag{89}$$

The motivation for using this law is that it is known to achieve the asymptotic
capacity [T^. Moreover, this law also satisfies the peak power constraint
$|X|^2 \le \mathcal{E}$, so that the lower bound on the cut-off rate we compute will also
be valid as a lower bound for the cut-off rate under a peak constraint. Finally,
as the next proposition shows, the fact that under $Q$ the input $X$ satisfies,
with probability one, $|X| \ge x_{\min}$, where $x_{\min} \to \infty$, greatly simplifies our
analysis. It allows us to asymptotically ignore the additive noise.
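The input law used for the lower bound is easy to sample. The sketch below assumes that the law has log|X|^2 uniformly distributed between log log E and log E with an independent uniform phase; the function name `sample_input` and the numerical value of the allowed power are hypothetical.

```python
import math
import random

def sample_input(energy, rng):
    """Draw X with log|X|^2 uniform on [log log E, log E] and a uniform phase.
    Requires energy > 1 so that both endpoints of the interval are defined."""
    lo, hi = math.log(math.log(energy)), math.log(energy)
    power = math.exp(rng.uniform(lo, hi))
    phase = rng.uniform(0.0, 2.0 * math.pi)
    return math.sqrt(power) * complex(math.cos(phase), math.sin(phase))

# Hypothetical allowed power; every draw has log E <= |X|^2 <= E.
rng = random.Random(1)
energy = 1.0e4
samples = [sample_input(energy, rng) for _ in range(5000)]
smallest_power = min(abs(x) ** 2 for x in samples)
largest_power = max(abs(x) ** 2 for x in samples)
```

The smallest possible input power under this law is log E, which grows without bound with the allowed power and is what lets the additive noise be ignored asymptotically.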

Proposition 2. Let $E_0(1, Q, 0)$ denote the function $E_0(\varrho, Q, r)$ evaluated at
$\varrho = 1$, $r = 0$ for the input law $Q$ to the Ricean channel of specular component
$d$ and additive noise variance $\sigma^2$. Let $E_0^{\sigma = 0}(1, Q, 0)$ be similarly defined for the
Ricean channel with the same specular component but without any additive
noise. If under the law $Q$ the input $X \in \mathbb{C}$ satisfies with probability one

$$|X| \ge x_{\min}$$

for some $x_{\min} > 0$ then

$$E_0(1, Q, 0) \ge E_0^{\sigma = 0}(1, Q, 0) - O\Bigl( \frac{1}{x_{\min}^2} \Bigr). \tag{90}$$
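Proof. For any input probability distribution Q, the term $E_0(1,Q,0)$ can be
expressed

$$\begin{aligned}
E_0(1,Q,0) &= -\log \int_x \int_{x'} \int_y \sqrt{w(y|x)\,w(y|x')}\; d\mu(y)\, dQ(x')\, dQ(x) \\
&= -\log \int_x \int_{x'} B(x,x';\sigma)\, dQ(x')\, dQ(x)
\end{aligned} \tag{91}$$

where

$$B(x,x';\sigma) = \int_y \sqrt{w(y|x)\,w(y|x')}\; d\mu(y) \tag{92}$$

and where for the Ricean fading channel with additive noise of variance $\sigma^2$

$$B(x,x';\sigma) = \frac{2\sqrt{|x|^2+\sigma^2}\,\sqrt{|x'|^2+\sigma^2}}{|x|^2+|x'|^2+2\sigma^2}\; \exp\!\left(-\frac{|d|^2\,|x-x'|^2}{2\left(|x|^2+|x'|^2+2\sigma^2\right)}\right). \tag{93}$$

Comparing $B(x,x';\sigma)$ with the corresponding term in the absence of noise
$B(x,x';0)$ we obtain

\begin{align}
B(x,x';\sigma) &\le B(x,x';0)\,\sqrt{1+\sigma^2/|x|^2}\,\sqrt{1+\sigma^2/|x'|^2}\; \exp\!\left(\frac{|d|^2\sigma^2\,|x-x'|^2}{\left(|x|^2+|x'|^2+2\sigma^2\right)\left(|x|^2+|x'|^2\right)}\right) \tag{94}\\
&\le B(x,x';0)\,\sqrt{1+\sigma^2/|x|^2}\,\sqrt{1+\sigma^2/|x'|^2}\; \exp\!\left(\frac{|d|^2\sigma^2\left(|x|+|x'|\right)^2}{\left(|x|^2+|x'|^2+2\sigma^2\right)\left(|x|^2+|x'|^2\right)}\right) \tag{95}
\end{align}

where the last inequality follows by the triangle inequality. It thus follows
from (91) and (95) that if under the law Q the random variable X satisfies
with probability one $|X| \ge x_{\min}$ then

$$\begin{aligned}
E_0(1,Q,0) &\ge E_0^{\sigma=0}(1,Q,0) - \sup_{|x|,|x'|\ge x_{\min}} \left\{ \log\sqrt{1+\sigma^2/|x|^2} + \log\sqrt{1+\sigma^2/|x'|^2} + \frac{|d|^2\sigma^2\left(|x|+|x'|\right)^2}{\left(|x|^2+|x'|^2+2\sigma^2\right)\left(|x|^2+|x'|^2\right)} \right\} \\
&= E_0^{\sigma=0}(1,Q,0) - O\!\left(\frac{|d|^2+1}{x_{\min}^2}\right).
\end{aligned}$$

□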
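The closed form (93) and the noise-inflation bound (95) can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: it takes the channel law $w(y|x)$ to be the circularly symmetric complex Gaussian density of mean $d\,x$ and variance $|x|^2+\sigma^2$, and the function names and the sample values of $x$, $x'$, $d$ and $\sigma^2$ are hypothetical.

```python
import math
import random


def bhattacharyya_closed(x: complex, xp: complex, d: complex, sigma2: float) -> float:
    """Closed form (93) for B(x, x'; sigma)."""
    v, vp = abs(x) ** 2 + sigma2, abs(xp) ** 2 + sigma2
    return (2.0 * math.sqrt(v * vp) / (v + vp)
            * math.exp(-abs(d) ** 2 * abs(x - xp) ** 2 / (2.0 * (v + vp))))


def bhattacharyya_grid(x, xp, d, sigma2, half_width=9.0, n=300) -> float:
    """Midpoint-rule evaluation of (92) over a square grid in the complex y-plane."""
    v, vp = abs(x) ** 2 + sigma2, abs(xp) ** 2 + sigma2
    mu, mup = d * x, d * xp
    centre = (mu + mup) / 2.0
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        re = centre.real - half_width + (i + 0.5) * h
        for k in range(n):
            y = complex(re, centre.imag - half_width + (k + 0.5) * h)
            # sqrt(w(y|x) w(y|x')) up to the constant 1 / (pi sqrt(v v'))
            total += math.exp(-abs(y - mu) ** 2 / (2.0 * v) - abs(y - mup) ** 2 / (2.0 * vp))
    return total * h * h / (math.pi * math.sqrt(v * vp))


def noise_inflation(x, xp, d, sigma2) -> float:
    """Factor multiplying B(x, x'; 0) on the right-hand side of (95)."""
    s = abs(x) ** 2 + abs(xp) ** 2
    return (math.sqrt(1.0 + sigma2 / abs(x) ** 2) * math.sqrt(1.0 + sigma2 / abs(xp) ** 2)
            * math.exp(abs(d) ** 2 * sigma2 * (abs(x) + abs(xp)) ** 2 / ((s + 2.0 * sigma2) * s)))


def check_noise_bound(d, sigma2, trials=1000, seed=1) -> bool:
    """Check B(x,x';sigma) <= B(x,x';0) * inflation at random input pairs."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = complex(rng.uniform(-3, 3), rng.uniform(-3, 3)) + 0.1
        b = complex(rng.uniform(-3, 3), rng.uniform(-3, 3)) - 0.1j
        if abs(a) < 0.1 or abs(b) < 0.1:
            continue  # skip inputs too close to zero, where B(.;0) degenerates
        lhs = bhattacharyya_closed(a, b, d, sigma2)
        rhs = bhattacharyya_closed(a, b, d, 0.0) * noise_inflation(a, b, d, sigma2)
        if lhs > rhs * (1.0 + 1e-12):
            return False
    return True
```

For instance, `bhattacharyya_closed(1+0.5j, -0.7+1.2j, 0.8+0.3j, 0.5)` and `bhattacharyya_grid(...)` with the same arguments should agree to many digits.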

Using this proposition with the law Q under which X is distributed
according to (89) we obtain that

$$\varliminf_{\mathcal{E}\to\infty} \left\{ R_0(\mathcal{E}) - E_0^{\sigma=0}(1,Q,0) \right\} \ge 0. \tag{96}$$
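Computing $E_0^{\sigma=0}(1,Q,0)$ from (91) and (93) we obtain

$$E_0^{\sigma=0}(1,Q,0) = -\log 8 + \frac{|d|^2}{2} + 2\log\log\frac{\mathcal{E}}{\log\mathcal{E}} - \log \int_{\sqrt{\log\mathcal{E}}}^{\sqrt{\mathcal{E}}} \int_{\sqrt{\log\mathcal{E}}}^{\sqrt{\mathcal{E}}} \frac{1}{\rho^2+\rho'^2}\, I_0\!\left(\frac{|d|^2\rho\rho'}{\rho^2+\rho'^2}\right) d\rho\, d\rho'. \tag{97}$$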


The last term on the RHS of the above is difficult to evaluate precisely. 
However, since the integrand is positive, the double integral can be upper 
bounded by inflating the region of integration to the region 

$$\left\{ \rho, \rho' \ge 0 : 2\log\mathcal{E} \le \rho^2 + \rho'^2 \le 2\mathcal{E} \right\}.$$


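The integral over this larger set can now be computed analytically by changing
to polar coordinates to obtain

$$\int_{\sqrt{\log\mathcal{E}}}^{\sqrt{\mathcal{E}}} \int_{\sqrt{\log\mathcal{E}}}^{\sqrt{\mathcal{E}}} \frac{1}{\rho^2+\rho'^2}\, I_0\!\left(\frac{|d|^2\rho\rho'}{\rho^2+\rho'^2}\right) d\rho\, d\rho' \le \frac{\pi}{4}\, I_0^2\!\left(\frac{|d|^2}{4}\right) \log\frac{\mathcal{E}}{\log\mathcal{E}} \tag{98}$$

where we have used the identity

$$\frac{1}{\pi}\int_0^{\pi} I_0(\xi\sin\varphi)\, d\varphi = I_0^2(\xi/2), \tag{99}$$

which follows from [18, 6.567]. Consequently, by (97) and (98)

$$E_0^{\sigma=0}(1,Q,0) \ge \log\log\frac{\mathcal{E}}{\log\mathcal{E}} + \frac{|d|^2}{2} - \log(2\pi) - 2\log I_0\!\left(\frac{|d|^2}{4}\right) \tag{100}$$

so that by (96)

$$\varliminf_{\mathcal{E}\to\infty} \left\{ R_0(\mathcal{E}) - \log\log\frac{\mathcal{E}}{\sigma^2} \right\} \ge \frac{|d|^2}{2} - \log(2\pi) - 2\log I_0\!\left(\frac{|d|^2}{4}\right). \tag{101}$$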
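To see how tight the lower bound (100) is at finite power, the noiseless exponent can be evaluated by one-dimensional quadrature. The sketch below rests on a reduction that is not in the text and should be read as an assumption: writing $u = \log|X|^2$, the noiseless Bhattacharyya term averaged over the uniform phases depends only on $\Delta = u - u'$, namely $\operatorname{sech}(\Delta/2)\, I_0\bigl(|d|^2 \operatorname{sech}(\Delta/2)/2\bigr)\, e^{-|d|^2/2}$, and $\Delta$ has a triangular density on $[-L, L]$ with $L = \log(\mathcal{E}/\log\mathcal{E})$. All function names are hypothetical.

```python
import math


def bessel_i0(x: float) -> float:
    """I_0(x) by its power series (adequate for moderate x)."""
    term, total, k = 1.0, 1.0, 0
    while term > 1e-17 * total:
        k += 1
        term *= (x / 2.0) ** 2 / (k * k)
        total += term
    return total


def identity_99_residual(xi: float, steps: int = 400) -> float:
    """|LHS - RHS| of identity (99), with the integral done by the midpoint rule."""
    h = math.pi / steps
    lhs = sum(bessel_i0(xi * math.sin((k + 0.5) * h)) for k in range(steps)) * h / math.pi
    return abs(lhs - bessel_i0(xi / 2.0) ** 2)


def interval_length(energy: float) -> float:
    """L = log(E / log E), the length of the support of log|X|^2 under (89)."""
    return math.log(energy) - math.log(math.log(energy))


def exact_e0_noiseless(energy: float, d_abs2: float, n: int = 20000) -> float:
    """E_0^{sigma=0}(1,Q,0) under the law (89), via Simpson's rule in Delta (n even)."""
    length = interval_length(energy)
    h = length / n

    def g(t: float) -> float:
        s = 1.0 / math.cosh(t / 2.0)
        return (length - t) * s * bessel_i0(d_abs2 * s / 2.0)

    acc = g(0.0) + g(length)
    for k in range(1, n):
        acc += (4.0 if k % 2 else 2.0) * g(k * h)
    expectation = 2.0 / length ** 2 * acc * h / 3.0
    return d_abs2 / 2.0 - math.log(expectation)


def lower_bound_100(energy: float, d_abs2: float) -> float:
    """Right-hand side of (100)."""
    return (math.log(interval_length(energy)) + d_abs2 / 2.0
            - math.log(2.0 * math.pi) - 2.0 * math.log(bessel_i0(d_abs2 / 4.0)))


def bound_gap(energy: float, d_abs2: float) -> float:
    return exact_e0_noiseless(energy, d_abs2) - lower_bound_100(energy, d_abs2)
```

In this sketch the gap `bound_gap(E, 1.0)` is positive and shrinks roughly like $1/\log\mathcal{E}$, which is consistent with (100) becoming tight only asymptotically.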


4.2 The Cut-Off Rate in the Presence of Receiver Side 
Information 

We next consider the case where the fading H is of zero-mean (Rayleigh)
and where the receiver has access to some side-information S that is jointly
Gaussian with H. We assume that the pair (H, S) is independent of the
additive noise Z and that the joint law of (H, S) and Z does not depend on
the channel input $x \in \mathbb{C}$. We denote the conditional mean of H given S = s
by

$$d_s = \mathsf{E}[H \mid S = s] \tag{102}$$

and the estimation error by

$$\epsilon^2 \triangleq \mathsf{E}\left[\, |H - d_s|^2 \,\middle|\, S = s \,\right]. \tag{103}$$


Note that unconditionally, $d_S$ is a zero-mean circularly-symmetric Gaussian
random variable of variance $1 - \epsilon^2$:

$$d_S \sim \mathcal{N}_{\mathbb{C}}\left(0,\, 1-\epsilon^2\right). \tag{104}$$
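A small simulation illustrates (102) to (104). The observation model used here, $S = H + W$ with $W \sim \mathcal{N}_{\mathbb{C}}(0,\eta)$ independent of $H$, is a hypothetical example chosen only for concreteness; the text merely requires $(H, S)$ to be jointly Gaussian. For this model $d_s = s/(1+\eta)$ and $\epsilon^2 = \eta/(1+\eta)$.

```python
import math
import random


def complex_gaussian(rng: random.Random, variance: float) -> complex:
    """Sample a zero-mean circularly symmetric complex Gaussian of the given variance."""
    s = math.sqrt(variance / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def simulate_side_information(eta: float, n: int = 100_000, seed: int = 0):
    """Return empirical (variance of d_S, mean squared estimation error).

    Assumed model (illustration only): S = H + W, W ~ N_C(0, eta) independent of H.
    """
    rng = random.Random(seed)
    gain = 1.0 / (1.0 + eta)          # d_s = E[H | S = s] = gain * s
    var_estimate = 0.0
    mse = 0.0
    for _ in range(n):
        h = complex_gaussian(rng, 1.0)
        s = h + complex_gaussian(rng, eta)
        d_s = gain * s
        var_estimate += abs(d_s) ** 2
        mse += abs(h - d_s) ** 2
    return var_estimate / n, mse / n


eta = 0.25
eps2 = eta / (1.0 + eta)              # estimation error (103) for this model
var_ds, mse = simulate_side_information(eta)
print(f"eps^2 = {eps2:.3f}, empirical MSE = {mse:.3f}")
print(f"1 - eps^2 = {1 - eps2:.3f}, empirical Var(d_S) = {var_ds:.3f}")
```

The empirical variance of $d_S$ should be close to $1-\epsilon^2$, in line with (104).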

Recall also that we only treat here the case $\epsilon^2 > 0$. Denoting the conditional
density of (Y, S) corresponding to the input $x \in \mathbb{C}$ by $w(y,s|x)$, we have by
the independence of the side information S and the input that

$$w(y,s|x) = f_S(s)\, w(y|x,s) \tag{105}$$

where $f_S$ is the density of the side information and where $w(y|x,s)$ is the
conditional law of Y given the input x and the side information s. Note that,
because (H, S) are jointly Gaussian, the density $w(y|x,s)$ is the Gaussian
density of mean $d_s \cdot x$ and variance $\epsilon^2 \cdot |x|^2 + \sigma^2$. Consequently,


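\begin{align}
E_0(1,Q,r) &= -\log \iint \left( \int \sqrt{w(y,s|x)}\; e^{r(|x|^2-\mathcal{E})}\, dQ(x) \right)^{2} d\mu(y)\, d\mu(s) \notag\\
&= -\log \int f_S(s) \int \left( \int \sqrt{w(y|x,s)}\; e^{r(|x|^2-\mathcal{E})}\, dQ(x) \right)^{2} d\mu(y)\, d\mu(s) \tag{106}\\
&= -\log \int f_S(s) \cdot \exp\bigl(-E_0(1,Q,r|s)\bigr)\, ds \tag{107}
\end{align}

where (106) follows from (105) and where (107) follows by defining

$$E_0(1,Q,r|s) = -\log \int \left( \int \sqrt{w(y|x,s)}\; e^{r(|x|^2-\mathcal{E})}\, dQ(x) \right)^{2} d\mu(y) \tag{108}$$

as the $E_0$ function corresponding to the channel $w(y|x,s)$ for S = s fixed.
(This channel is a Ricean fading channel, except that the fading is not
normalized to have unit variance.)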
The cut-off rate $R_0(\mathcal{E}|S)$ in the presence of the side information S can be
thus upper bounded by

\begin{align}
R_0(\mathcal{E}|S) &\le \sup_{Q:\,\mathsf{E}_Q[|X|^2]\le\mathcal{E}}\ \sup_{r\ge 0} \left\{ -\log \int f_S(s)\cdot \exp\bigl(-E_0(1,Q,r|s)\bigr)\, ds \right\} \tag{109}\\
&\le -\log \int f_S(s)\cdot \exp\Bigl( -\sup_{Q:\,\mathsf{E}_Q[|X|^2]\le\mathcal{E}}\ \sup_{r\ge 0} E_0(1,Q,r|s) \Bigr)\, ds \tag{110}\\
&= -\log \int f_S(s)\cdot \exp\bigl(-R_0(\mathcal{E}|S=s)\bigr)\, ds \tag{111}
\end{align}

where

$$R_0(\mathcal{E}|S=s) \triangleq \sup_{Q:\,\mathsf{E}_Q[|X|^2]\le\mathcal{E}}\ \sup_{r\ge 0} E_0(1,Q,r|s) \tag{112}$$

is the cut-off rate corresponding to power $\mathcal{E}$ communication over the channel
$w(y|x,s)$ for fixed S = s.®

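It now follows from (111) that

$$R_0(\mathcal{E}|S) - \log\log\frac{\mathcal{E}}{\sigma^2} \le -\log \int f_S(s)\cdot \exp\!\left( -\left( R_0(\mathcal{E}|S=s) - \log\log\frac{\mathcal{E}}{\sigma^2} \right) \right) ds$$

and consequently

\begin{align}
\varlimsup_{\mathcal{E}\to\infty} \left\{ R_0(\mathcal{E}|S) - \log\log\frac{\mathcal{E}}{\sigma^2} \right\}
&\le \varlimsup_{\mathcal{E}\to\infty} -\log \int f_S(s)\cdot \exp\!\left( -\left( R_0(\mathcal{E}|S=s) - \log\log\frac{\mathcal{E}}{\sigma^2} \right) \right) ds \tag{113}\\
&= -\log \varliminf_{\mathcal{E}\to\infty} \int f_S(s)\cdot \exp\!\left( -\left( R_0(\mathcal{E}|S=s) - \log\log\frac{\mathcal{E}}{\sigma^2} \right) \right) ds \tag{114}\\
&\le -\log \int f_S(s) \varliminf_{\mathcal{E}\to\infty} \exp\!\left( -\left( R_0(\mathcal{E}|S=s) - \log\log\frac{\mathcal{E}}{\sigma^2} \right) \right) ds \tag{115}\\
&= -\log \int f_S(s)\cdot \exp\!\left( -\varlimsup_{\mathcal{E}\to\infty} \left\{ R_0(\mathcal{E}|S=s) - \log\log\frac{\mathcal{E}}{\sigma^2} \right\} \right) ds \tag{116}\\
&= -\log \int f_S(s)\cdot \exp\!\left( -\frac{|d_s|^2}{2\epsilon^2} + \log(2\pi) + 2\log I_0\!\left(\frac{|d_s|^2}{4\epsilon^2}\right) \right) ds \tag{117}\\
&= \log\frac{1}{\epsilon^2} - \log K\!\left(\sqrt{1-\epsilon^4}\right) - \log 4. \tag{118}
\end{align}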

®This definition is consistent with fK5|l since the cost constraint on the cut-off rate is 
always active for the Ricean fading channel. 
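Here the swapping of the limit and the expectation (second inequality) is
justified using Fatou's lemma and we use the result

$$\lim_{\mathcal{E}\to\infty} \left\{ R_0(\mathcal{E}|S=s) - \log\log\frac{\mathcal{E}}{\sigma^2} \right\} = \frac{|d_s|^2}{2\epsilon^2} - \log(2\pi) - 2\log I_0\!\left(\frac{|d_s|^2}{4\epsilon^2}\right) \tag{119}$$

which follows from (jEHl) applied to the un-normalized Ricean fading channel
whose specular component is $d_s$ and whose granular component is of variance
$\epsilon^2$. The evaluation of the last integral is based on an identity combining
[18, 6.612] and [US 160.02]

$$\int_0^{\infty} e^{-\alpha x}\, I_0^2(\beta x)\, dx = \frac{2}{\pi\alpha}\, K\!\left(\frac{2\beta}{\alpha}\right), \qquad \alpha > 2\beta > 0$$

and the identity for the elliptic function [23, Eq. (3.2.4)]

$$K\!\left(\frac{1-k'}{1+k'}\right) = \frac{1+k'}{2}\, K(k), \qquad k' = \sqrt{1-k^2}, \quad 0 < k, k' < 1. \tag{120}$$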
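The step from (117) to (118) can be checked numerically. The sketch below is an illustration under assumptions of my own: it uses the fact that $|d_S|^2$ is exponentially distributed with mean $1-\epsilon^2$ (a consequence of (104)), evaluates $K$ through the arithmetic-geometric mean, and compares the expectation inside (117) with $4\epsilon^2 K(\sqrt{1-\epsilon^4})$. Function names and the choice $\epsilon^2 = 0.5$ are hypothetical.

```python
import math


def bessel_i0(x: float) -> float:
    """I_0(x) by its power series (adequate for the moderate arguments used here)."""
    term, total, k = 1.0, 1.0, 0
    while term > 1e-17 * total:
        k += 1
        term *= (x / 2.0) ** 2 / (k * k)
        total += term
    return total


def elliptic_k(k: float) -> float:
    """Complete elliptic integral of the first kind with modulus k, via the AGM."""
    a, b = 1.0, math.sqrt(1.0 - k * k)
    for _ in range(40):
        a, b = (a + b) / 2.0, math.sqrt(a * b)
    return math.pi / (2.0 * a)


def expectation_numeric(eps2: float, n: int = 8000) -> float:
    """E[ 2 pi exp(-T/(2 eps^2)) I_0(T/(4 eps^2))^2 ] for T ~ Exp(mean 1 - eps^2).

    Simpson's rule on [0, 40 * mean]; the tail beyond that is below 1e-17.
    """
    mean = 1.0 - eps2
    upper = 40.0 * mean
    h = upper / n

    def integrand(t: float) -> float:
        z = t / (4.0 * eps2)
        scaled = bessel_i0(z) * math.exp(-z)          # e^{-z} I_0(z)
        return math.exp(-t / mean) / mean * 2.0 * math.pi * scaled * scaled

    acc = integrand(0.0) + integrand(upper)
    for k in range(1, n):
        acc += (4.0 if k % 2 else 2.0) * integrand(k * h)
    return acc * h / 3.0


def expectation_closed(eps2: float) -> float:
    """4 eps^2 K(sqrt(1 - eps^4)), the exponential of minus the constant in (118)."""
    return 4.0 * eps2 * elliptic_k(math.sqrt(1.0 - eps2 * eps2))


def constant_118(eps2: float) -> float:
    """Right-hand side of (118)."""
    return (math.log(1.0 / eps2) - math.log(elliptic_k(math.sqrt(1.0 - eps2 * eps2)))
            - math.log(4.0))
```

With $\epsilon^2 = 0.5$ both `expectation_numeric(0.5)` and `expectation_closed(0.5)` evaluate to about 4.313.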

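In view of (118), to establish (HOI) it now suffices to show

$$\varliminf_{\mathcal{E}\to\infty} \left\{ R_0(\mathcal{E}|S) - \log\log\frac{\mathcal{E}}{\sigma^2} \right\} \ge \log\frac{1}{\epsilon^2} - \log K\!\left(\sqrt{1-\epsilon^4}\right) - \log 4. \tag{121}$$

To this end we note that by (107) and (nnn) evaluated at r = 0

$$R_0(\mathcal{E}|S) \ge -\log \int f_S(s)\cdot \exp\bigl(-E_0(1,Q,0|s)\bigr)\, ds \tag{122}$$

for any law Q satisfying $\mathsf{E}_Q[|X|^2] \le \mathcal{E}$. We next choose, as before, Q to be a
law under which X is circularly symmetric with

$$\log|X|^2 \sim \text{Uniform}\left(\log\log\mathcal{E},\ \log\mathcal{E}\right) \tag{123}$$

whence by Proposition 2 and (100) applied to the Ricean channel of fading
mean $d_s$ and granular component $\epsilon^2$ and the tightness of the lower bound

$$\lim_{\mathcal{E}\to\infty} \left\{ E_0(1,Q,0|s) - \log\log\frac{\mathcal{E}}{\sigma^2} \right\} = \frac{|d_s|^2}{2\epsilon^2} - \log(2\pi) - 2\log I_0\!\left(\frac{|d_s|^2}{4\epsilon^2}\right) \tag{124}$$

for every s. The desired result (121) now follows from (122) using
the Dominated Convergence Theorem and (124).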


A A Lagrange Duality 


In this appendix we prove the following Lagrange duality: 

Proposition 3. For any discrete memoryless channel and any $\varrho > 0$, the
problem

$$\min_{Q}\ e^{-E_{\mathrm{CK},0}(\varrho,Q)} \tag{125}$$

is a Lagrange dual of the problem

$$\min_{Q}\ e^{-E_{\mathrm{G},0}(\varrho,Q)} \tag{126}$$

where Q is a distribution on the input alphabet. In particular, since strong
duality holds,

$$\max_{Q} E_{\mathrm{G},0}(\varrho,Q) = \max_{Q} E_{\mathrm{CK},0}(\varrho,Q).$$

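Proof. Consider a discrete memoryless channel $W(y|x)$ with input $X \in \mathcal{X}$,
$|\mathcal{X}| = N$ and output $Y \in \mathcal{Y}$, $|\mathcal{Y}| = M$. We henceforth introduce the more
standard, for optimization problems, vector notation for functions on discrete
domains. Hence, let $\mathbf{q} \in \mathbb{R}_+^{N}$ be a probability distribution on $\mathcal{X}$ and
$\mathbf{w} \in \mathbb{R}_+^{N\times M}$ a matrix whose (i,j)-th element is given by

$$w_{ij} = W(y_j|x_i)^{\frac{1}{1+\varrho}}, \qquad x_i \in \mathcal{X},\ y_j \in \mathcal{Y},\ \varrho > 0.$$

Hence, (126) can be written as:

$$\min_{\mathbf{q},\mathbf{f}}\ \sum_{j} f_j^{1+\varrho} \qquad \text{s.t. } \mathbf{q}\mathbf{w} = \mathbf{f},\ \ \mathbf{q} \succeq 0,\ \ \mathbf{q}\mathbf{1} = 1,$$

where $\mathbf{f} \in \mathbb{R}^{M}$ is an auxiliary vector that we introduce in this problem.

The domain D of this optimization problem is $D = \{(\mathbf{q},\mathbf{f}) \mid \mathbf{q} \succeq 0\}$. For any
$\varrho > 0$ the objective function is convex in D. Furthermore, all equality and
inequality constraints are affine. Hence, the problem is a convex optimization
problem. We will perform a relaxation, which is nevertheless tight for the
optimal values of $\mathbf{f}$ and $\mathbf{q}$, to the constraint $\mathbf{q}\mathbf{w} = \mathbf{f}$, namely

$$\min_{\mathbf{q},\mathbf{f}}\ \sum_{j} f_j^{1+\varrho} \qquad \text{s.t. } \mathbf{f} \succeq \mathbf{q}\mathbf{w},\ \ \mathbf{q} \succeq 0,\ \ \mathbf{q}\mathbf{1} = 1.$$

The Lagrangian function of this problem is

$$L(\mathbf{q},\mathbf{f},\mathbf{u},\mu,\boldsymbol{\lambda}) = \sum_{j} f_j^{1+\varrho} + (\mathbf{q}\mathbf{w} - \mathbf{f})\mathbf{u} + (1 - \mathbf{q}\mathbf{1})\mu - \mathbf{q}\boldsymbol{\lambda},$$

where $\boldsymbol{\lambda} \succeq 0 \in \mathbb{R}^{N}$, $\mathbf{u} \succeq 0 \in \mathbb{R}^{M}$, $\mu \in \mathbb{R}$ and $(\mathbf{q},\mathbf{f}) \in D$.

Since the Lagrangian function is affine with respect to $\mathbf{q}$, we impose the dual
inequality constraint $\mu\mathbf{1} \preceq \mathbf{w}\mathbf{u}$, minimize the Lagrangian over $\mathbf{f}$ and obtain
the Lagrange dual problem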

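$$\max_{\mathbf{u},\mu}\ \left\{ \mu - \varrho \sum_{j} \left(\frac{u_j}{1+\varrho}\right)^{\frac{1+\varrho}{\varrho}} \right\} \qquad \text{s.t. } \mu\mathbf{1} \preceq \mathbf{w}\mathbf{u},\ \ \mathbf{u} \succeq 0.$$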

This is a concave problem, with the objective function being monotonic with
respect to all the optimization variables. Since we maximize it in a polyhedron,
the optimum will be on the boundary, of maximum distance from the
hyperplane $\mu = 0$ and of minimum distance from all hyperplanes that define
the polyhedron. Therefore, some dual constraint has to be active, i.e.,

$$\min_{i} \sum_{j} w_{ij} u_j = \mu.$$

Consequently, the dual problem becomes

$$\max_{\mathbf{u} \succeq 0} \left\{ -\varrho \sum_{j} \left(\frac{u_j}{1+\varrho}\right)^{\frac{1+\varrho}{\varrho}} + \min_{i} \sum_{j} w_{ij} u_j \right\}.$$

We perform the transformation of variables j = 1,...,M, 

where $\mathbf{r} \in \mathbb{R}^{M}$ is chosen to be a probability distribution and $\alpha \in \mathbb{R}^{+}$ is the
appropriate normalizing scalar. Optimizing over $\alpha$ yields



s.t. r ^ 0, rl = 1 

/ g \ 0+1 

which, because of the fact that concave with respect to 

Q 

r and monotonic with respect to Yhj > concludes the proof. □ 
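The strong duality asserted in Proposition 3 can be observed numerically on a toy channel. The sketch below is an illustration, not the authors' construction. It assumes that the output-side (dual) objective has the form appearing in (132), that is $\max_x -(1+\varrho)\log\sum_y W(y|x)^{1/(1+\varrho)} R(y)^{\varrho/(1+\varrho)}$ minimized over output laws R, and it uses a hypothetical binary channel matrix `W`.

```python
import math

# Hypothetical binary-input, binary-output channel: W[x][y] = W(y|x).
W = [[0.9, 0.1], [0.3, 0.7]]


def gallager_e0(rho: float, q: float) -> float:
    """E_{G,0}(rho, Q) for the input law Q = (q, 1 - q)."""
    probs = (q, 1.0 - q)
    total = 0.0
    for y in range(2):
        inner = sum(probs[x] * W[x][y] ** (1.0 / (1.0 + rho)) for x in range(2))
        total += inner ** (1.0 + rho)
    return -math.log(total)


def dual_objective(rho: float, r: float) -> float:
    """max_x -(1+rho) log sum_y W(y|x)^{1/(1+rho)} R(y)^{rho/(1+rho)}, R = (r, 1 - r)."""
    out = (r, 1.0 - r)
    return max(
        -(1.0 + rho) * math.log(
            sum(W[x][y] ** (1.0 / (1.0 + rho)) * out[y] ** (rho / (1.0 + rho))
                for y in range(2)))
        for x in range(2))


def ternary_search(f, lo: float, hi: float, maximize: bool, iters: int = 200) -> float:
    """Locate the extremum of a unimodal function on [lo, hi]."""
    for _ in range(iters):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if (f(a) < f(b)) == maximize:
            lo = a
        else:
            hi = b
    return (lo + hi) / 2.0


def primal_and_dual(rho: float):
    """Return (max_Q E_{G,0}, min_R dual objective) for the toy channel."""
    q_star = ternary_search(lambda q: gallager_e0(rho, q), 0.0, 1.0, maximize=True)
    r_star = ternary_search(lambda r: dual_objective(rho, r), 1e-9, 1.0 - 1e-9, maximize=False)
    return gallager_e0(rho, q_star), dual_objective(rho, r_star)
```

Every output law R gives an upper bound on every $E_{\mathrm{G},0}(\varrho,Q)$ (weak duality), and for this channel the two optimal values returned by `primal_and_dual(1.0)` coincide to numerical precision.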

B Proof of (ED 

Proof. We begin with the case where the cost constraint is active. Fix some
$\varrho > 0$ and let $Q^*$ and $r^*$ achieve

$$\max_{Q:\,\mathsf{E}_Q[g(X)]=\Upsilon}\ \max_{r\ge 0}\ E_0(\varrho,Q,r)$$

so that

$$E_0(\varrho,Q^*,r^*) = \max_{Q:\,\mathsf{E}_Q[g(X)]=\Upsilon}\ \max_{r\ge 0}\ E_0(\varrho,Q,r). \tag{127}$$

Following jH Eq. (7.3.26)] we define

$$\alpha(y) \triangleq \sum_{x\in\mathcal{X}} Q^*(x)\, e^{r^*(g(x)-\Upsilon)}\, W(y|x)^{\frac{1}{1+\varrho}}, \qquad y \in \mathcal{Y}. \tag{128}$$

With this definition we have by (127) and (128)

$$\max_{Q:\,\mathsf{E}_Q[g(X)]=\Upsilon}\ \max_{r\ge 0}\ E_0(\varrho,Q,r) = E_0(\varrho,Q^*,r^*) = -\log \sum_{y\in\mathcal{Y}} \alpha^{1+\varrho}(y). \tag{129}$$

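Also, by m Eq. (7.3.28)]

$$\sum_{y\in\mathcal{Y}} \alpha^{\varrho}(y)\, e^{r^*(g(x)-\Upsilon)}\, W(y|x)^{\frac{1}{1+\varrho}} \ge \sum_{y\in\mathcal{Y}} \alpha^{1+\varrho}(y), \qquad \forall x \in \mathcal{X}. \tag{130}$$

Consider now the distribution $R^*$ on $\mathcal{Y}$ given by

$$R^*(y) = \frac{\alpha^{1+\varrho}(y)}{\sum_{y'\in\mathcal{Y}} \alpha^{1+\varrho}(y')}. \tag{131}$$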

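We now have by (IT^ that for any distribution Q

$$E_{\mathrm{CK},0}(\varrho,Q) \le -(1+\varrho) \sum_{x\in\mathcal{X}} Q(x) \log\left( \sum_{y\in\mathcal{Y}} W(y|x)^{\frac{1}{1+\varrho}}\, R^*(y)^{\frac{\varrho}{1+\varrho}} \right) \tag{132}$$

and if $\mathsf{E}_Q[g(X)] = \Upsilon$ then

$$\begin{aligned}
E_{\mathrm{CK},0}(\varrho,Q)
&\le -(1+\varrho) \sum_{x\in\mathcal{X}} Q(x) \log\left( \sum_{y\in\mathcal{Y}} e^{r^*(g(x)-\Upsilon)}\, W(y|x)^{\frac{1}{1+\varrho}}\, R^*(y)^{\frac{\varrho}{1+\varrho}} \right) \\
&= -(1+\varrho) \sum_{x\in\mathcal{X}} Q(x) \log\left( \sum_{y\in\mathcal{Y}} \alpha^{\varrho}(y)\, e^{r^*(g(x)-\Upsilon)}\, W(y|x)^{\frac{1}{1+\varrho}} \right) + \varrho \log \sum_{y\in\mathcal{Y}} \alpha^{1+\varrho}(y) \\
&\le -\log \sum_{y\in\mathcal{Y}} \alpha^{1+\varrho}(y) \\
&= \max_{Q:\,\mathsf{E}_Q[g(X)]=\Upsilon}\ \max_{r\ge 0}\ E_0(\varrho,Q,r).
\end{aligned}$$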
Here the first inequality follows from (132) because the condition $\mathsf{E}_Q[g(X)] = \Upsilon$
guarantees that the introduction of the exponential term $\exp\{r^*(g(x) - \Upsilon)\}$
has zero net effect; the subsequent equality by (131); the subsequent
inequality by (130); and the final equality by (129). It thus follows upon
taking the supremum in the above over all laws Q satisfying $\mathsf{E}_Q[g(X)] = \Upsilon$
that

$$\max_{Q:\,\mathsf{E}_Q[g(X)]=\Upsilon} E_{\mathrm{CK},0}(\varrho,Q) \le \max_{Q:\,\mathsf{E}_Q[g(X)]=\Upsilon}\ \max_{r\ge 0}\ E_0(\varrho,Q,r). \tag{133}$$

On the other hand, by iED we obtain

$$\max_{Q:\,\mathsf{E}_Q[g(X)]=\Upsilon} E_{\mathrm{CK},0}(\varrho,Q) \ge \max_{Q:\,\mathsf{E}_Q[g(X)]=\Upsilon}\ \max_{r\ge 0}\ E_0(\varrho,Q,r) \tag{134}$$

which combines with (133) to prove the claim for active cost constraints.
For the case of inactive cost constraints we have

$$\begin{aligned}
\max_{Q:\,\mathsf{E}_Q[g(X)]\le\Upsilon} E_{\mathrm{CK},0}(\varrho,Q) &\le \max_{Q} E_{\mathrm{CK},0}(\varrho,Q) \\
&= \max_{Q} E_{\mathrm{G},0}(\varrho,Q)
\end{aligned}$$

Here the first inequality follows by relaxing the constraint; the subsequent
equality by (EU; and the final equality by (P|) . This combines with (jlUj) to
conclude the proof. □
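The construction of the output law $R^*$ in (128) to (131) can be illustrated on a toy channel without a cost constraint, which corresponds to taking $r^* = 0$. This is a simplifying assumption made only for the sketch; the channel matrix `W`, the value of `RHO`, and all function names are hypothetical.

```python
import math

# Hypothetical binary channel W[x][y] = W(y|x); no cost constraint, so r* = 0.
W = [[0.9, 0.1], [0.3, 0.7]]
RHO = 1.0
EXPO = 1.0 / (1.0 + RHO)


def exp_neg_e0(q: float) -> float:
    """exp(-E_0(rho, Q)) = sum_y (sum_x Q(x) W(y|x)^{1/(1+rho)})^{1+rho}; convex in q."""
    probs = (q, 1.0 - q)
    return sum(sum(probs[x] * W[x][y] ** EXPO for x in range(2)) ** (1.0 + RHO)
               for y in range(2))


def argmin_convex(f, lo: float = 0.0, hi: float = 1.0, iters: int = 200) -> float:
    """Ternary search for the minimizer of a convex function on [lo, hi]."""
    for _ in range(iters):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if f(a) < f(b):
            hi = b
        else:
            lo = a
    return (lo + hi) / 2.0


q_star = argmin_convex(exp_neg_e0)
Q_STAR = (q_star, 1.0 - q_star)

# (128) with r* = 0: alpha(y) = sum_x Q*(x) W(y|x)^{1/(1+rho)}
alpha = [sum(Q_STAR[x] * W[x][y] ** EXPO for x in range(2)) for y in range(2)]
s = sum(a ** (1.0 + RHO) for a in alpha)          # exp(-E_0) at the optimum, cf. (129)
e0_max = -math.log(s)

# Left-hand side of (130) for each input letter x
kkt_lhs = [sum(alpha[y] ** RHO * W[x][y] ** EXPO for y in range(2)) for x in range(2)]

# (131): output law proportional to alpha^{1+rho}
r_star = [a ** (1.0 + RHO) / s for a in alpha]

# Per-letter value of the bound (132) evaluated at R*
per_letter = [
    -(1.0 + RHO) * math.log(
        sum(W[x][y] ** EXPO * r_star[y] ** (RHO / (1.0 + RHO)) for y in range(2)))
    for x in range(2)
]

print(f"max_Q E_0 = {e0_max:.6f}")
print("bound (132) per input letter:", [round(v, 6) for v in per_letter])
```

In this example both input letters are used by $Q^*$, so (130) holds with near equality and the per-letter values of the bound match $\max_Q E_0$ up to numerical error.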


C Derivation of (|HH1) 

To derive we begin by noting that for p > 1 the integrand can be lower 
bounded by its value when a = 0 because 

(p^ + 5) ^ > (p^ + 5) % a ,6 > 0, p > 1 . 

In the region 0 < p < 1 we can use the inequality 

(p^ + h) ^ (p^ + ^) ^ , a, 5 > 0, 0 < p < 1. 

Combining the above two bounds we obtain that throughout the region of 
integration 

(p^ + 5 ) ^ > 5^ (p^ + 5 ) % a > 0, 0 < (5 < 1, 0 < p < cx) 
and hence 

i{x] a, 13,6) > 5^ ■ i{x; a = 0, P, 6). (135) 
We next relate $i(x; \alpha = 0, \beta, \delta)$ to $i(x; \alpha = 0, \beta, \delta = 0)$. To that end
denote the integrand in $i(x; \alpha = 0, \beta, \delta)$ by


? 7 (p; x, /3,6,d) = e ^ 




p^ + 6 


In 


\d\ ■ \x\ ■ p 

Ixp + 


We now write the integral as 
i{x; a = 0, (3,5) = 


/mi (5 poo 

+ / v{p-,x,f3,S,d)dp. 

J \/mi5 


In the region p > yrr^ we have 


p^ 


> 


p"^ + 5 Y mi + 1 


nil 


and hence 


/ p{p-,x,(3,5,d)dp > 

'V^ V "^1 + 1 Jv^ 


mi 


p{p; X, (3, 5 = 0, d) dp. (136) 


We next show that when $\sqrt{m_1\delta}$ is small, the integral over the interval
$[0, \sqrt{m_1\delta}]$ is also small. Indeed,

Ixl 


< ^, xeC 


|a;p + 2a’ 

which combines with the monotonicity of $I_0(\cdot)$ and the fact that the argument
to the exponential function is negative to demonstrate that

Ml -P' 


0 < p{p-,x,(3,6 = 0,d) < Iq 


2a 


and hence that 

/ pmiS 


rf{p; X,I3,6 ^ 0, d) d() < \/m,i5 • lo 


(137) 


On the other hand a straightforward calculation demonstrates that 

/ oo poo 

p{p-., X, (3,6 = 0, d) dp > / p{p-., X, (3,6 = 0, d = 0) dp 


TT / /5(|a;P + (T^) 


/5 + \x\^ + a‘^ 


> 


Ti(3a‘^ 


2{(3 + a^) 


(138) 
where the first inequality follows from the monotonicity of $I_0(\cdot)$ and the final
inequality follows from simple algebra. We thus conclude that


a = 0, /3, 5 ) 

poo 


> 


> 


/m\6 


r]{p;x,f3,5, d)dp 


mi 


mi + 1 


p{p;x,(3,6 = 0,d) dp 


mi 


/mi5 


mi + 1 


p{p; X, {3,5 = 0, d) dp 


mi 


>0 Jo 

/mi 5 


JO 


/o ' vip',x,l3,6 = 0,d)dp 


mi + 1 I p{p-,x,{5,S = 0,d)dp 


> 


mi 


mi + 1 


v^'Io 


7r/3cr^ 


p{p; X, {3,5 = 0, d) dp 
i{x;a = 0,P,5 = 0) (139) 


2{P+a^) 


where the first inequality follows from the non-negativity of the integrand;
the subsequent inequality from (136); and the final inequality from (137) &
(138). The desired bound (jMD now follows from (135) and (139).